Real Party in Interest

The real party in interest in this appeal is California Institute of Technology, Pasadena, 
California, which is the assignee of the present application. 

Related Appeals and Interferences 

Appellants are not aware of any related appeals or interferences that will directly affect or 
be directly affected by or have a bearing on the Board's decision in the pending appeal. 

Status of Claims 

The Examiner's Action of March 28, 2001 made final the rejection of claims 1-19 and 
25-26. On July 24, 2001, the Applicants appealed. The status of the claims is as follows: Claim 
2 was canceled in the amendment filed July 24, 2001; Claims 1, 3-19 and 25-26 stand rejected 
under 35 U.S.C. § 103(a) and are the subject of this appeal. Claims 20-24 are allowed. 

Status of Amendments 

Applicants submitted amendments on July 24, 2001. These amendments were entered 
and acknowledged by the Examiner in the Advisory Action mailed August 2, 2001. No 
amendments remain outstanding in the application. 

Summary of The Invention 

The rejected claims relate to polyamide molecules comprising one or more amino acids 
containing N-methylpyrrole, 3-hydroxy-N-methylpyrrole, or N-methylimidazole moieties, and a 
positive patch consisting of a rigid group adjacent to a positively charged group. The rigid group of
the polyamide molecule comprises a first amino acid of arginine, proline, lysine, or 
hydroxyproline, and a second amino acid of proline, glycine, serine, threonine, leucine, 
isoleucine, valine, alanine, or hydroxyproline, respectively. Such molecules can provide 
sequence-specific binding within the minor groove of a DNA molecule, while presenting the 
positively charged group in an orientation that allows the polyamide molecule to disrupt 
interactions between proteins and the phosphate backbone or major groove of the DNA 
molecule. 

At the time the instant patent application was filed, polyamide molecules were well 
known in the biological and chemical arts. For example, proteins, which play a critical role in 
virtually all biological processes, were known to the skilled artisan to be polymers of individual 
residues - known as amino acids. There are 20 different amino acid residues that commonly 
make up proteins, which are linked together through amide bonds to form the protein: 

[Structure: peptide backbone with side chains R1, R2, and R3; adjacent residues are joined by amide bonds]

The compositions of the present invention also comprise individual molecular building
blocks linked by amide bonds, and are thus also referred to by the skilled artisan by the generic
term "polyamide." However, these molecular building blocks, recited in the claims as being
N-methylpyrrole, 3-hydroxy-N-methylpyrrole, and N-methylimidazole, are unlike those
traditionally used by nature to form proteins:

[Structures: N-methylpyrrole, 3-hydroxy-N-methylpyrrole, and N-methylimidazole residues]

For example, traditional amino acid residues are α-amino acids (i.e., the NH2 group and
the COOH group that participate in the amide bond are both connected to the same carbon atom),
while the imidazole- and pyrrole-based residues are not. Also, unlike traditional amino acids, the
imidazole- and pyrrole-based residues are aromatic structures. The aromatic nature of these
residues (including the sp² nature of the α-carbon) results in a substantially planar structure that
is unlike that of any of the traditional α-amino acids.

With rare exceptions, imidazole- and pyrrole-based polyamides are not naturally 
occurring molecules. One example of such naturally occurring imidazole- and pyrrole-based 
polyamides is Distamycin: 

Distamycin, and other similar naturally occurring polyamides, exhibit antibiotic, antiviral,
and antitumor activity, mediated by the ability of these polyamides to bind double-stranded 
DNA. One drawback to these polyamides, however, is that, while they bind double-stranded 
DNA, they do so with poor sequence specificity. See, e.g., specification, page 2, lines 1-16. This 
makes targeting such molecules to a specific DNA sequence difficult, if not impossible. 

In contrast, through the use of N-methylpyrrole, 3-hydroxy-N-methylpyrrole, and
N-methylimidazole residues, the present invention provides improved sequence-specificity of DNA
binding. Moreover, the instantly claimed polyamides include a "positive patch" consisting of a 
rigid group comprising at least 2 amino acids adjacent to a positively charged group. The 
inclusion of the positively charged group allows for the alteration of the chemical environment 
surrounding the DNA molecule, allowing the polyamide molecule to disrupt interactions between 
proteins and the phosphate backbone or major groove of the DNA molecule. See, e.g., 
specification, page 13, lines 10-17. 

Thus, the present invention may provide antibiotic, antiviral, and antitumor compositions 
that can be specifically targeted to a DNA sequence. These compositions can provide a true 
pharmaceutical "magic bullet;" that is, a composition designed to specifically attack a DNA 
sequence that is mediating a disease, while ignoring non-target DNA sequences in the patient. 

Issues 

1. Whether claims to polyamides that bind in a sequence-specific manner to the 
minor groove of a DNA molecule, wherein the polyamide comprises one or more amino acids 
comprising moieties of N-methylpyrrole, 3-hydroxy-N-methylpyrrole, or N-methylimidazole, and 
a positive patch consisting of a rigid group comprising at least 2 specified amino acids adjacent
to a positively charged group, are unpatentable under 35 U.S.C. § 103 over various cited
publications, when the skilled artisan would lack both a motivation to combine the references as 
suggested by the Examiner, and a reasonable expectation of success in combining the references 
to arrive at the instantly claimed invention. 

Grouping of Claims 

Claims 1 and 3-11 stand or fall together, claims 12-19 stand or fall together, and
claims 25-26 stand or fall together. Specifically, claims 1 and 3-11 relate to polyamides
that specifically bind to base pairs in the minor groove, comprising one or more amino acids 
comprising moieties of N-methylpyrrole, 3-hydroxy-N-methylpyrrole, or N-methylimidazole, and 
a positive patch consisting of a rigid group comprising a first and a second amino acid adjacent to 
a positively charged group, the first amino acid being selected from the group consisting of 
arginine, proline, lysine, and hydroxyproline; and the second amino acid being selected from the 
group consisting of proline, glycine, serine, threonine, leucine, isoleucine, valine, alanine, and 
hydroxyproline; claims 12-19 relate to such polyamides, further comprising a symmetrical 
number of carboxamide groups on each side of a hairpin linkage; and claims 25-26 relate to 
methods of inhibiting gene expression using such polyamides. 

The Examiner's Rationale 

The Examiner's rationale for rejecting claims 1, 3-19, and 25-26 is stated in the final 
Office Action mailed on March 28, 2001 ("Paper No. 10"), and the Advisory Action mailed on 
August 2, 2001 ("Paper No. 15"). The Examiner contends that the claims are unpatentable under
35 U.S.C. § 103(a) over Swalley et al., J. Am. Chem. Soc. 118: 8198-206 (1996), Parks et al., J.
Am. Chem. Soc. 118: 6153-9 (1996), and Trauger et al., J. Am. Chem. Soc. 118: 6160-6 (1996),
in view of Feng et al., Science 263: 348-55 (1994).

Specifically, the Examiner asserts that the Swalley, Parks, and Trauger publications 
disclose polyamide compounds comprising amino acid moieties bearing N-methylimidazole and 
N-methylpyrrole, and N,N-dimethylaminopropyl-amide moieties (Paper No. 10, p. 4). The 
Examiner alleges that proline, arginine, and histidine are chemically and functionally similar to 
N-methylpyrrole, N,N-dimethylamino-propylamide, and N-methylimidazole (Paper No. 10, pp. 
4-5). The Examiner further asserts that the Feng publication discloses a polyamide compound 
that specifically interacts with the minor groove of DNA utilizing the sequence Gly-Arg-Pro-Arg 
at the carboxyl terminal domain, and binds to the major groove involving a helix-turn-helix
α-helix motif.

The Examiner contends that it would have allegedly been obvious to modify the 
polyamides of the Swalley, Parks, and Trauger publications with a sequence of amino acids 
comprising Arg-Pro-Arg because "polyamide compounds comprising these sequences are known 
to bind DNA with high affinity in a sequence specific manner." (Paper No. 10, page 5). The 
Examiner does not specifically address the rationale with regard to claims 25-26, which relate to 
inhibiting gene expression using the claimed polyamides. 

Argument

Appellants respectfully request that the rejection of claims 1, 3-19, and 25-26 be
withdrawn or reversed, and that the instant claims be permitted to proceed to allowance, because 
the Examiner has failed to establish a prima facie case of obviousness. The skilled artisan would 
lack a motivation to select the three amino acid "arg-pro-arg" sequence from the larger Hin 
recombinase molecule and combine this sequence with the polyamides of the Swalley, Parks, and 
Trauger publications, because arg and pro are neither chemically nor functionally similar to the 
pyrrole- and imidazole-based residues disclosed in the Swalley, Parks, and Trauger publications. 
Moreover, the skilled artisan would not have a reasonable expectation of success in combining 
the three amino acid "arg-pro-arg" sequence with the polyamides of the Swalley, Parks, and 
Trauger publications to provide the instantly claimed polyamides, because these residues in 
isolation do not possess DNA binding characteristics. Instead, the Examiner has fallen victim to 
reconstructing the claimed invention by hindsight. 

Applicable Legal Standard 

To establish a prima facie case of obviousness, three criteria must be met: 1) there must be
some motivation or suggestion, either in the cited publications or in knowledge available to
one skilled in the art, to modify or combine the cited publications; 2) there must be a reasonable
expectation of success in combining the publications to achieve the claimed invention; and 3) the
publications must teach or suggest all of the claim limitations. In re Vaeck, 20 USPQ2d 1438
(Fed. Cir. 1991); MPEP § 2143.

Proline, Histidine, and Arginine are not structurally similar to N-methylpyrrole,
N-methylimidazole, and N,N-dimethylaminopropylamide, as the Examiner alleges

The Examiner's asserted prima facie case first relies on the incorrect contention that the
person of ordinary skill would understand that proline and arginine are "chemically and
functionally similar" to the N-methylpyrrole, N-methylimidazole, and
N,N-dimethylaminopropylamide recited in the claims. See, e.g., Paper No. 10, page 4. Referring to the
structures below, those skilled in the art would readily recognize that the amino acids proline and
arginine are not chemically or structurally similar to N-methylpyrrole and
N,N-dimethylaminopropylamide. For example, proline comprises a saturated heterocyclic moiety,
whereas an amino acid derived from N-methylpyrrole is an aromatic molecule, the chemical and
structural nature of which is entirely different from proline:



[Structures: proline residue in a polyamide; N-methylpyrrole residue in a polyamide]

It is well known, for example, that the lone electron pair on the ring nitrogen atom of
N-methylpyrrole interacts with the double bonds to impart a significant degree of aromatic character
to N-methylpyrrole. In contrast, aromaticity is clearly not possible with the saturated heterocyclic
moiety of proline. Such aromatic rings differ significantly from their non-aromatic counterparts
in many ways (e.g., structure, physical properties, reactivity). For example, in contrast to
proline, the N-methylpyrrole structure is a substantially planar molecule. The skilled artisan
would understand that this difference in structure would result in significantly different binding
interactions with a target DNA molecule.

Additionally, while proline in a polyamide is an α-amino acid linked through the ring
nitrogen, N-methylpyrrole is not an α-amino acid and contains a methyl group at the ring
nitrogen. The skilled artisan would understand that any contacts made by proline with a DNA
molecule would be completely different from those made by N-methylpyrrole. It is precisely these
very different molecular interactions with DNA, provided by the pyrrole- and imidazole-based 
residues of the present invention when compared to naturally occurring proteins, that provide the 
advantageous DNA binding characteristics of the instantly claimed polyamides. 

Similarly, referring to the structures below, it is clear that arginine is not similar to
N,N-dimethylaminopropylamide.

[Structures: arginine residue in a polyamide; N,N-dimethylaminopropylamide moiety]

First, the guanidine moiety which is present in arginine is conspicuously absent in N,N- 
dimethylaminopropylamide. Second, in contrast to N,N-dimethylaminopropylamide, arginine 
contains several basic nitrogen atoms. Clearly, when these molecules are incorporated into a 
polyamide, significantly different three dimensional structures, exhibiting significantly different
interactions with a target DNA molecule, will result. Accordingly, the Examiner's assertion that
arginine and N,N-dimethylaminopropylamide are chemically similar is respectfully submitted to 
be in error. 

Finally, referring to the structures below, it is equally clear that the skilled artisan
would also understand that histidine is neither functionally nor structurally similar to
N-methylimidazole, as the Examiner asserts:

[Structure: histidine residue in a polyamide]

Thus, given the number of variables that must be selected and modified, the nature and
significance of the differences between the various molecules, and the absence of any scientific
reason to believe the individual moieties would function in a similar manner to one another,
there is no motivation for the skilled artisan to incorporate Arg-Pro-Arg from the Feng et al.
publication into the polyamide compounds of Swalley, Parks, and Trauger. See, e.g., In re Jones,
21 USPQ2d 1941, 1943 (Fed. Cir. 1992) (in an obviousness determination, the number of
variables that must be selected or modified and the nature and significance of the differences
should be considered); see also MPEP § 2144.08(II)(A)(4) ("The prior art, and not the
applicant's disclosure, must provide one of ordinary skill in the art the motivation to make the
proposed molecular modifications needed to arrive at the claimed compound"). Therefore, the
Examiner's rejection cannot be maintained on the basis of "structural similarity."

Proline, Histidine, and Arginine are also not functionally similar to N-methylpyrrole,
N-methylimidazole, and N,N-dimethylaminopropylamide, as the Examiner alleges

In arguing that proline and arginine are functionally similar to the N-methylpyrrole,
N-methylimidazole, and N,N-dimethylaminopropylamide recited in the claims, the Examiner
contends that the Feng et al. publication "describe[s] polyamides comprising the residues: Gly,
Arg, Pro, Arg, wherein said polyamides can be used to bind the minor groove of DNA in a
sequence specific manner." Paper No. 15, page 2. The Examiner alleges that, because the small
polyamides disclosed in the Swalley, Parks, and Trauger publications also bind to DNA in a 
sequence specific manner, the skilled artisan would combine the Arg-Pro-Arg sequence with the 
small polyamides. Appellants respectfully submit that this is incorrect. 

Unlike the small polyamides disclosed in the Swalley, Parks, and Trauger publications, 
nothing of record indicates that the sequence Arg-Pro-Arg in isolation can bind DNA at all. The 
Examiner has arrived at this proposed substitution only by selectively choosing the sequence 
Arg-Pro-Arg from the much larger Hin recombinase molecule disclosed in the Feng et al. 
publication, while ignoring passages in the Feng et al. publication that indicate the fallacy of this 
choice. 

For example, the Feng et al. publication discloses that the amino acid sequence
Gly139-Arg140-Pro141-Arg142 is located at the amino-terminal arm of the Hin DNA-binding domain. Feng
provides no information about the Arg-Pro-Arg sequence selected in isolation by the Examiner,
but does state that the Gly residue in the sequence Gly139-Arg140 is essential for sequence specific
DNA binding by Hin recombinase. See Feng et al., page 348, column 3. Thus, the Feng et al.
publication teaches away from removing the Gly139 residue and selecting Arg-Pro-Arg, as the
Examiner is proposing.

Moreover, the Feng et al. publication also states that α-helix 3, which is located at the
carboxyl-terminal end of the DNA binding domain, is the DNA recognition helix for the Hin
protein (Feng, p. 350, left column; Fig. 1, p. 348), that the Gly139-Arg140-Pro141-Arg142 sequence
adopts an extended conformation to present the 4 amino acid sequence in a proper manner to
bind DNA (see Feng et al., page 351, columns 2-3), and that Ile144 is critical in maintaining this
orientation. Id. at column 3. In short, the authors conclude that DNA binding by Hin
recombinase is a function of the entire 3-dimensional structure of the protein, stating that
"[s]pecific binding requires both major groove interactions involving a helix-turn-helix (HTH)
α-helix motif and minor groove interactions involving the sequence Gly139-Arg140-Pro141-Arg142."
Id. at 348, right column. See W.L. Gore & Associates, Inc. v. Garlock, Inc., 220 USPQ 303 (Fed.
Cir. 1983), cert. denied, 469 U.S. 851 (1984) (a prior art reference must be considered in its
entirety, including teachings that would lead away from the claimed invention).

Thus, the Feng et al. publication makes it clear that it is only in the overall three- 
dimensional structure of Hin recombinase that the Gly-Arg sequence has any role in binding 
DNA. There is no indication in the cited references that the three amino acid sequence selected 
by the Examiner is in any way functionally similar to the N-methylpyrrole, N-methylimidazole,
and N,N-dimethylaminopropylamide recited in the claims, as the Examiner contends. Moreover,
since the Feng et al. publication makes no statements about the Arg-Pro-Arg sequence in
isolation, there is no motivation provided by the references or art to make the selections and
perform the steps the Examiner has suggested for grafting these sequences onto the polyamides
disclosed in Swalley et al., Parks et al., and Trauger et al.

The skilled artisan would lack a reasonable expectation of success in grafting an Arg-Pro-Arg 
sequence to a polyamide molecule to provide the instantly claimed molecules 

Because it is only in the context of the 3-dimensional structure of Hin recombinase that
the Arg-Pro-Arg sequence selected by the Examiner participates in DNA binding, the skilled
artisan would lack a reasonable expectation that simply grafting these amino acids onto the
polyamides disclosed in the Swalley, Parks, and Trauger publications would provide the instantly
claimed molecules.

In maintaining the rejection, the Examiner contends that "Feng et al. teach that sequence
Gly-Arg-Pro-Arg is necessary for minor groove binding to DNA." Paper No. 15, page 3,
emphasis added. This is a very different thing, however, from a teaching that the sequence is 
sufficient for such DNA binding, particularly since the Feng et al. publication makes it clear that 
it is only in the overall three-dimensional structure of Hin recombinase that the Gly-Arg 
sequence has any role in binding DNA. Thus, unless the skilled artisan understood that the Arg- 
Pro-Arg sequence was sufficient to provide DNA binding, there would be no reasonable 
expectation that selecting these three amino acids in isolation from a much larger protein having 
numerous other amino acids that are also necessary for sequence specific binding would be of 
any use at all to the small polyamides disclosed in the Swalley, Parks, and Trauger publications. 
And the Feng et al. publication makes it very clear that the Arg-Pro-Arg sequence is not
sufficient to provide DNA binding.

Similarly, the Examiner argues that "Feng et al. teach the presence of the Arg-Pro-Arg
sequence in Drosophila Engrailed is also responsible for minor groove binding." Paper No. 15,
page 3. Here, the Examiner mischaracterizes the teachings of the Feng et al. publication, which
demonstrates that, just as for Hin recombinase, the Arg-Pro-Arg sequence in Drosophila 
engrailed would not bind to DNA in the absence of numerous other engrailed structures. For 
example, in the very same sentence referred to by the Examiner, the Feng et al. publication
indicates that an adjacent threonine residue in engrailed also interacts with the DNA by
contacting the phosphate backbone. See Feng et al., page 355, left column. Additionally, DNA
binding by engrailed requires contacts with the major groove that are very similar to those of Hin
recombinase. See, e.g., Feng et al., Figure 10. Thus, just as in the case of Hin recombinase, the
Feng et al. publication makes it clear that it is only in the overall three-dimensional structure of 
engrailed that the Arg-Pro-Arg sequence has any role in binding DNA. 

Moreover, an enormous number of other proteins also include the Arg-Pro-Arg sequence 
selected by the Examiner. An internet search of the well known Protein Information Resource
(PIR) database reveals that the sequence Arg-Pro-Arg is present in 11,003 proteins in the
database, and nothing indicates that each of these "polyamides with a sequence comprising
Arg-Pro-Arg ... are known to bind DNA with a high affinity in a sequence specific manner." (Paper
No. 10, page 5). The search results and printouts of the relevant web pages with the first 100
proteins obtained are attached as Appendix B. 
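
For illustration only, the kind of motif count described above can be sketched as a simple substring scan. This is not the PIR search interface itself, which is not reproduced here; the function names and the toy sequences are hypothetical, and sequences are assumed to be written in one-letter amino acid codes, so that Arg-Pro-Arg appears as "RPR".

```python
# Assumption: sequences use one-letter amino acid codes, so Arg-Pro-Arg is "RPR".
MOTIF = "RPR"


def proteins_with_motif(sequences, motif=MOTIF):
    """Return the sorted names of sequences that contain the motif at least once."""
    return sorted(name for name, seq in sequences.items() if motif in seq.upper())


def count_proteins_with_motif(sequences, motif=MOTIF):
    """Count how many sequences contain the motif at least once."""
    return len(proteins_with_motif(sequences, motif))


if __name__ == "__main__":
    # Hypothetical toy entries, not taken from any real database.
    toy_database = {
        "hypothetical_A": "MKRPRLLAG",
        "hypothetical_B": "MAAAGGSTV",
        "hypothetical_C": "GGRPRKKRPR",
    }
    print(count_proteins_with_motif(toy_database))
```

A scan of this kind reports only whether the three residues occur in a sequence; it says nothing about whether any of the matching proteins binds DNA.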

Viewed in the context of the enormous number of proteins containing the sequence Arg- 
Pro-Arg, the person of ordinary skill in the art would understand that it is only in the context of 
the entire Hin recombinase 3-dimensional structure that the sequence would have any role in 
"[binding] DNA with a high affinity in a sequence specific manner." This is also confirmed by
the Feng et al. publication, which, as discussed above, states that other parts of the Hin
recombinase protein are required for sequence specific DNA binding. Furthermore, the Feng et
al. publication states that "[a] large body of footprinting, mutation, and chemical derivatization
data has indicated [the specific binding] features of Hin-DNA interaction ... are distinctive to
prokaryotic DNA binding proteins" (emphasis added) (Feng, p. 348, right column), and not
generic to the short Arg-Pro-Arg sequence. Therefore, the person of ordinary skill in the art,
upon reviewing Feng et al., would understand that the sequence Arg-Pro-Arg would not be
suitable for use in isolation as some sort of DNA-binding motif.

Thus, because the Feng et al. publication completely fails to teach or suggest that this sequence has
any general applicability to DNA binding, or any applicability outside of the specific Hin 
recombinase, there is no reasonable expectation of success in making the combination proposed 
by the Examiner. 

The skilled artisan would not seek to graft an Arg-Pro-Arg sequence to a polyamide
molecule to provide the "positive patch" of the instant claims

Furthermore, the Examiner's arguments in support of the asserted prima facie case rely
on the contention that the skilled artisan would allegedly be motivated to graft an Arg-Pro-Arg
sequence obtained from the Feng et al. publication onto the polyamides disclosed in Swalley et
al., Parks et al., and Trauger et al., to provide the "positive patch" of the instant claims, because
of an asserted "similarity" between arg and pro, and the pyrrole- and imidazole-based residues
disclosed in the Swalley, Parks, and Trauger publications. But the Examiner ignores the fact that
the "positive patch" of the instant claims does not serve a minor groove binding function, as the
Examiner apparently believes.

For example, the inclusion of the positively charged group is intended to disrupt 
interactions between proteins and the phosphate backbone or major groove of the DNA 
molecule, and not to bind to the minor groove. See, e.g., specification, page 14, lines 20-29. 
There is nothing of record in the asserted prima facie case to indicate that the skilled artisan 
would be motivated to select an Arg-Pro-Arg sequence, as disclosed in the Feng et al.
publication, in order to disrupt interactions between proteins and the phosphate backbone or 
major groove of the DNA molecule. Furthermore, because nothing of record indicates that an 
Arg-Pro-Arg sequence would serve such a function, the skilled artisan would lack a reasonable 
expectation of success in combining the references as suggested by the Examiner to provide the 
instantly claimed polyamide molecules. 

The Examiner Has Used Impermissible Hindsight in Making Selections and Combinations 

When the claims are properly interpreted, it is apparent that neither the references cited by
the Examiner nor the knowledge available to one of ordinary skill in the art provides a motivation
for making the proposed combination, or a reasonable expectation of success in doing so.
Applicants respectfully submit that, instead of carrying the burden of establishing a prima facie
case of obviousness, the Examiner has fallen victim to "...decomposing an invention into its
constituent elements, finding each element in the prior art, and then claiming that it is easy to
reassemble these elements into the invention...". In re Mahurkar, 28 USPQ2d 1801, 1817 (N.D.
Ill. 1993). An obviousness determination cannot be premised on such an impermissible use of
hindsight. See In re Fine, 5 USPQ2d 1596, 1600 (Fed. Cir. 1988) ("To imbue one of ordinary
skill in the art with knowledge of the invention in suit, when no prior art reference or references
of record convey or suggest that knowledge, is to fall victim to the insidious effect of a hindsight
syndrome wherein that which only the inventor taught is used against the teacher.").



Conclusion

For the reasons discussed above, the Examiner has failed to carry the burden of
establishing a prima facie case of obviousness. Appellants respectfully submit the instant claims
are in condition for allowance, and respectfully request that the rejections be withdrawn or
reversed, and that the rejected claims, together with the claims previously indicated as allowable,
be allowed to issue.



Respectfully submitted, 

Recognition of a 5'-(A,T)GGG(A,T)₂-3' Sequence in the Minor
Groove of DNA by an Eight-Ring Hairpin Polyamide

Abstract: The use of pyrrole-imidazole polyamides for the recognition of core 5'-GGG-3' sequences in the minor
groove of double stranded DNA is described. Two hairpin pyrrole-imidazole polyamides, ImImIm-γ-PyPyPy-β-Dp
and ImImImPy-γ-PyPyPyPy-β-Dp (Im = N-methylimidazole-2-carboxamide, Py = N-methylpyrrole-2-carboxamide,
β = β-alanine, γ = γ-aminobutyric acid, and Dp = ((dimethylamino)propyl)amide), as well as the
corresponding EDTA affinity cleaving derivatives, were synthesized and their DNA binding properties analyzed.
Quantitative DNase I footprint titrations demonstrate that ImImIm-γ-PyPyPy-β-Dp binds the formal match sequence
5'-AGGGA-3' with an equilibrium association constant of Ka = 5 × 10⁶ M⁻¹ (10 mM Tris-HCl, 10 mM KCl, 10
mM MgCl₂, and 5 mM CaCl₂, pH 7.0 and 22 °C). ImImImPy-γ-PyPyPyPy-β-Dp binds the same site, 5'-AGGGAA-3',
approximately two orders of magnitude more tightly than the six-ring polyamide, with an equilibrium association
constant of Ka = 4 × 10⁸ M⁻¹. The eight-ring hairpin polyamide demonstrates greater specificity for single base pair
mismatches than does the six-ring hairpin. Polyamides with an EDTA·Fe(II) moiety at the carboxy terminus confirm
that each hairpin binds in a single orientation. The high affinity recognition of a 5'-GGG-3' core sequence by an
eight-ring polyamide containing three contiguous imidazole amino acids demonstrates the versatility of
pyrrole-imidazole polyamides and broadens the sequence repertoire for DNA recognition.
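
As a quick check of the "approximately two orders of magnitude" statement, the two association constants in the abstract can be compared directly. The sketch below takes the six-ring constant as 5 × 10⁶ M⁻¹ and reads the eight-ring constant as 4 × 10⁸ M⁻¹; that second value is an assumption about a partly illegible figure, chosen because it is the reading consistent with the two-orders-of-magnitude statement. The free energy conversion uses the standard relation ΔG = -RT ln Ka, which the abstract itself does not state, and all function names are illustrative.

```python
import math

K_SIX_RING = 5e6    # M^-1, six-ring hairpin (as stated in the abstract)
K_EIGHT_RING = 4e8  # M^-1, eight-ring hairpin (assumed reading, see lead-in)

GAS_CONSTANT_KCAL = 1.987e-3  # kcal / (mol K)


def fold_increase(k_new, k_ref):
    """Ratio of two equilibrium association constants."""
    return k_new / k_ref


def orders_of_magnitude(k_new, k_ref):
    """Base-10 logarithm of the affinity ratio."""
    return math.log10(fold_increase(k_new, k_ref))


def binding_free_energy_kcal(k_a, temp_c=22.0):
    """Standard binding free energy from dG = -RT ln(Ka), in kcal/mol."""
    temp_k = temp_c + 273.15
    return -GAS_CONSTANT_KCAL * temp_k * math.log(k_a)


if __name__ == "__main__":
    print(fold_increase(K_EIGHT_RING, K_SIX_RING))
    print(round(orders_of_magnitude(K_EIGHT_RING, K_SIX_RING), 2))
    print(round(binding_free_energy_kcal(K_SIX_RING), 2))
    print(round(binding_free_energy_kcal(K_EIGHT_RING), 2))
```

Under these assumptions the ratio is 80-fold, a little under two full orders of magnitude.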



Introduction 

Pyrrole-imidazole polyamide-DNA complexes afford a
general method for the design of non-natural molecules for
sequence-specific recognition in the minor groove of DNA.¹⁻⁵
Polyamides containing N-methylpyrrole (Py) and N-methylimidazole
(Im) carboxamides bind to the minor groove as side-by-side
antiparallel dimers and are capable of specific recognition
of sequences containing G,C base pairs, where the N3
of each imidazole forms a hydrogen bond with a single guanine
exocyclic amino group.¹ The side-by-side pairing of an
imidazole ring from one polyamide with a pyrrole ring from
the second polyamide recognizes a G·C base pair, while a
pyrrole-imidazole combination recognizes a C·G base pair.¹
Finally, a pyrrole-pyrrole pair recognizes either an A·T or T·A
base pair.¹,² By employing the 2:1 model, specific recognition
of the sequences 5'-(A,T)G(A,T)C(A,T)-3', 5'-(A,T)G(A,T)₃-3',
5'-(A,T)₂G(A,T)₂-3', and 5'-(A,T)GCGC(A,T)-3' has been
achieved.¹⁻⁵
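
The pairing rules in the preceding paragraph amount to a small lookup table, and a minimal sketch of that table follows. It assumes two fully paired ring strands, each written from N-terminus to C-terminus and stacked antiparallel, with the first strand read alongside the 5'-to-3' direction of the target strand. The function name and the use of the IUPAC code W (A or T) are illustrative choices, and ring pairings not mentioned in the text are rejected rather than guessed.

```python
# Ring pair (first strand / second strand) -> base on the targeted strand,
# as quoted in the text. "W" is the IUPAC code for A or T.
PAIRING_RULES = {
    ("Im", "Py"): "G",  # Im/Py recognizes a G.C base pair
    ("Py", "Im"): "C",  # Py/Im recognizes a C.G base pair
    ("Py", "Py"): "W",  # Py/Py recognizes A.T or T.A
}


def predicted_core(first_strand, second_strand):
    """Predict the core sequence recognized by two side-by-side ring strands.

    Both strands are listed N-terminus to C-terminus. Assumption: the strands
    stack antiparallel, so ring i of the first strand pairs with ring
    (n - 1 - i) of the second strand.
    """
    if len(first_strand) != len(second_strand):
        raise ValueError("this sketch only handles fully paired strands")
    core = []
    for ring, partner in zip(first_strand, reversed(second_strand)):
        if (ring, partner) not in PAIRING_RULES:
            raise ValueError(f"{ring}/{partner} is not covered by the quoted rules")
        core.append(PAIRING_RULES[(ring, partner)])
    return "".join(core)


if __name__ == "__main__":
    # Three Im rings stacked against three Py rings.
    print(predicted_core(["Im", "Im", "Im"], ["Py", "Py", "Py"]))  # GGG
```

For three imidazoles stacked against three pyrroles the sketch returns GGG, which matches the 5'-GGG-3' core named in the title.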

Covalent head-to-tail linkage of two polyamides by y-ami- 
uobutyric acid (y) to form a "hairpin" polyamide results in bodi 
increased affinity and specificity, as compared to die iuU inked 



* Abscracc published in Advunce ACS Abstracts; August 15, 1996. 

(1) (a i Wade. W. S.: Mrksiclu M; Dervatu P. B. J. Am, Chem. Sac. 
1992. //</, S7R3. (b) Mrksicli, M.; Wade, W. S.; Dwyer, T. (jeierstajigcr, 
B. H.; Wcmincr. D. E.: Dervan, P. B. Prac. Natl. Acad. Set. U.S.A. 1992, 
89, 7586. (c) Wade. W. S.; Mrksich, M.; Oervan. P. B. Biochemistry 1993, 
J*\ 11385. 

(2) (a) Pelion. J. G.; Wemmer, D. E. Proc. Nati. Acad. Set. U.S.A. 1989. 
SA t 5723. (bt Pelton, J. C; Wenuncr, D. E. ./. Am. Chew. S»c. 1990, 112, 
1393. (c) Clien. X.; Ramnkrishnan, B.; Rao, S. T.; Sundamlingham, M. 
Struct. Biol. Nature 1994. /, 169. 

(3) (a) Mrksich. M.; OervaQ, P. B. / Am. Chem, Soc. 1993, J IS, 2372. 
lb i Geierstanijer, B. H.; Jacobsco, J.-P.; Mrfcsich, M.; Dervan, P. B.; 
Wemmer, D. E. Biochemistry 1 994,. J J, 3055. 

(4 1 Geierstauger. B. H.; Dwyer. T. J.; Bauiini, Y.; Lown, J. W.; Wemmer, 
D. E. J. Am. Chem. Soc. 1993, 11 S, 4474. 

(5) (a) Gcicrsianger, B. R; iMrksich, M.; Dervan, P. B,; Wemmer, D. E. 
Science 1994. 266, 646. (b) Mrlcsich, M.; Dervan, P. B. J. Am. Chem, Soc 
1995, 117, 3325. 



polyamides. 5 For instance, the 1:1 hairpin motif has been used
to recognize 5'-(A,T)G(A,T)₃-3' by ImPyPy-γ-PyPyPy-Dp with
approximately 300-fold greater affinity than the unlinked
polyamides, ImPyPy and PyPyPy. 6 A C-terminal β-alanine
residue increases both affinity and specificity and facilitates solid
phase synthesis, as recently demonstrated with ImPyPy-γ-PyPyPy-β-Dp. 7,8
Furthermore, a sequence containing two
contiguous G·C base pairs, 5'-(A,T)GG(A,T)₂-3', has been
recognized by ImImPy-γ-PyPyPy-β-Dp. 9
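
The pairing rules and hairpin examples above can be sketched as a small lookup. This is only an illustration, not a method from the paper: the function name `hairpin_site`, the use of the IUPAC code W for "A or T", and the choice to add one flanking W on each side of the paired core are assumptions made here so that the output matches the target sites quoted in the text.

```python
import re

# Ring pairings as stated in the text, keyed by
# (ring on the N-terminal strand, ring on the C-terminal strand).
PAIRING_RULES = {
    ("Im", "Py"): "G",  # Im/Py recognizes a G·C base pair
    ("Py", "Im"): "C",  # Py/Im recognizes a C·G base pair
    ("Py", "Py"): "W",  # Py/Py recognizes A·T or T·A (W = A or T)
}


def hairpin_site(name, turn="γ"):
    """Predict the 5'->3' target site of a hairpin polyamide name.

    Hypothetical helper: the name is split at the turn residue, the rings
    of the two strands are paired antiparallel, and one flanking W is added
    on each side (an assumption chosen to reproduce the quoted sites).
    """
    n_strand, c_strand = name.split(turn, 1)
    top = re.findall(r"Im|Py", n_strand)
    bottom = re.findall(r"Im|Py", c_strand)[::-1]  # antiparallel strand
    if len(top) != len(bottom):
        raise ValueError("both strands need the same number of rings")
    core = []
    for pair in zip(top, bottom):
        if pair not in PAIRING_RULES:
            raise ValueError(f"no pairing rule for {pair}")
        core.append(PAIRING_RULES[pair])
    return "W" + "".join(core) + "W"


if __name__ == "__main__":
    for polyamide in ("ImPyPy-γ-PyPyPy-Dp", "ImImPy-γ-PyPyPy-β-Dp"):
        print(polyamide, "->", "5'-" + hairpin_site(polyamide) + "-3'")
```

Under these assumptions, ImPyPy-γ-PyPyPy-Dp maps to 5'-WGWWW-3', that is 5'-(A,T)G(A,T)₃-3', and ImImPy-γ-PyPyPy-β-Dp maps to 5'-WGGWW-3', that is 5'-(A,T)GG(A,T)₂-3'.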

To further expand the sequence repertoire available with the
hairpin motif, two polyamides containing three contiguous
imidazole rings, ImImIm-γ-PyPyPy-β-Dp (1) and ImImImPy-γ-PyPyPyPy-β-Dp (2),
and the corresponding affinity cleaving
analogs, ImImIm-γ-PyPyPy-β-Dp-EDTA (1-E) and ImImImPy-γ-PyPyPyPy-β-Dp-EDTA (2-E),
were synthesized using solid
phase synthetic protocols (Figures 1 and 2). 7 Specific hydrogen
bonds are expected to form between each imidazole N3 and
one of the three individual guanine 2-amino groups on the floor
of the minor groove (Figure 1). The eight-ring hairpin polyamide,
with a pyrrole between the C-terminal imidazole and
the γ-linker, was synthesized to examine whether the positioning
of the final imidazole immediately adjacent to the turn would
adversely affect binding affinity or specificity. We report here
the affinities and relative specificities of these tris-imidazole
polyamides as determined by three separate techniques: MPE·Fe(II)
footprinting, 10 DNase I footprinting, 11 and affinity cleaving. 12
Information about binding site size is gained from MPE·Fe(II)
footprinting, while quantitative DNase I footprint titrations

(6) Mrksich, M.; Parks, M. E.; Dervan, P. B. J. Am. Chem. Soc. 1994, 116, 7983.

(7) Baird, E. E.; Dervan, P. B. J. Am. Chem. Soc. 1996, 118, 6141.

(8) Parks, M. E.; Baird, E. E.; Dervan, P. B. J. Am. Chem. Soc. 1996, 118, 6147.

(9) Parks, M. E.; Baird, E. E.; Dervan, P. B. J. Am. Chem. Soc. 1996, 118, 6153.

(10) (a) Van Dyke, M. W.; Dervan, P. B. Biochemistry 1983, 22, 1373. (b) Van Dyke, M. W.; Dervan, P. B. Nucleic Acids Res. 1983, 11, 5555.
Results



[Figure 1 graphic. (A) ImImIm-γ-PyPyPy-β-Dp bound to 5'-WGGGW-3' / 3'-WCCCW-5', i.e. (A,T)GGG(A,T). (B) ImImImPy-γ-PyPyPyPy-β-Dp bound to 5'-WGGGWW-3' / 3'-WCCCWW-5', i.e. (A,T)GGG(A,T)(A,T).]

Figure 1. Binding model for the complexes formed between the DNA
and either the six-ring hairpin polyamide ImImIm-γ-PyPyPy-β-Dp (A)
or the eight-ring hairpin polyamide ImImImPy-γ-PyPyPyPy-β-Dp (B).
Circles with dots represent lone pairs of N3 of purines and O2 of
pyrimidines. Circles containing an H represent the N2 hydrogen of
guanine. Putative hydrogen bonds are illustrated by dotted lines. Ball
and stick models are also shown. Shaded and nonshaded circles denote
imidazole and pyrrole carboxamides, respectively. Nonshaded diamonds
represent the β-alanine residue. W represents either an A or T base.

allow the determination of equilibrium association constants (Ka)
for the polyamides to a variety of match and single base pair
mismatch sequences. Affinity cleavage studies confirm the
binding orientation and stoichiometry of the 1:1 hairpin:DNA
complex.



Synthesis of Polyamides. The polyamides ImImIm-γ-PyPyPy-β-Dp (1) and ImImImPy-γ-PyPyPyPy-β-Dp (2) were
synthesized in a stepwise manner from Boc-β-alanine-Pam resin
using recently described Boc-chemistry protocols in 14 and 18
steps, respectively. 7 The polyamides were then cleaved by a
single step aminolysis reaction with ((dimethylamino)propyl)amine
and subsequently purified by HPLC chromatography. The
syntheses of the polyamides, ImImImPy-γ-PyPyPyPy-β-Dp (2),
ImImImPy-γ-PyPyPyPy-β-Dp-NH₂ (2-NH₂), and ImImImPy-γ-PyPyPyPy-β-Dp-EDTA (2-E),
are outlined (Figure 3). For
the synthesis of the EDTA analog, a sample of resin is treated
with 3,3'-diamino-N-methyldipropylamine (55 °C) and then
purified by preparatory HPLC to provide 2-NH₂. The polyamide
ImImImPy-γ-PyPyPyPy-β-Dp-NH₂ (2-NH₂) provides a
free aliphatic primary amine group suitable for modification.
This polyamide-amine is then treated with an excess of the
dianhydride of EDTA (DMSO/NMP, DIEA, 55 °C) and the
remaining anhydride hydrolyzed (0.1 M NaOH, 55 °C). The
EDTA modified polyamide is then isolated by preparatory
HPLC.

Identification of Binding Site Size and Orientation by
MPE·Fe(II) Footprinting and Affinity Cleaving. MPE·Fe(II)
footprinting on the 3'- and 5'-³²P end-labeled 274 base pair
EcoRI/PvuII restriction fragment from plasmid pSES1 (25 mM
Tris-acetate, 10 mM NaCl, 100 μM calf thymus DNA, pH 7.0,
and 22 °C) reveals that the polyamides, each at 10 μM
concentration, are binding to the four designed sites,
5'-AGGGA(A)-3', 5'-AGGCA(A)-3', 5'-TGGGT(C)-3', and
5'-TGGGC(T)-3' (Figures 4 and 5). No footprinting is seen with
either compound at the single base pair mismatch site,
5'-AGGAA(A)-3' (Figure 4, quantitated data not shown). The
footprinting patterns for the six-ring polyamide are consistent
with five base pair binding sites, while the patterns for the
eight-ring polyamide are indicative of six base pair binding sites.
Affinity cleavage experiments on the same restriction fragment
(25 mM Tris-acetate, 200 mM NaCl, 50 μg/mL glycogen, pH
7.0 and 22 °C) by the six-ring and eight-ring EDTA·Fe(II)
analogs reveal cleavage patterns that are 3'-shifted and appear
on only one side of each binding site (Figures 4 and 5).
ImImIm-γ-PyPyPy-β-Dp-EDTA·Fe(II), at 1 μM, and
ImImImPy-γ-PyPyPyPy-β-Dp-EDTA·Fe(II), at 100 nM, show cleavage
patterns that demonstrate recognition of the same binding
sites identified by MPE·Fe(II) footprinting. No carrier DNA
was used in these experiments, and thus the relative cleavage
intensities indicate that the six-ring polyamide binds most
strongly to the two match sites 5'-AGGGA-3' and 5'-TGGGT-3',
followed by the end mismatch 5'-TGGGC-3'. The core
mismatch 5'-AGGCA-3', with little appreciable cleavage at 1
μM concentration, is bound more weakly. Similarly, the
eight-ring polyamide binds most strongly at 100 nM to 5'-AGGGAA-3',
much less strongly to 5'-TGGGTC-3', and not significantly
to 5'-TGGGCT-3' and 5'-AGGCAA-3'.

Determination of Binding Affinities by Quantitative DNase
I Footprinting. Quantitative DNase I footprint titration experiments
(10 mM Tris-HCl, 10 mM KCl, 10 mM MgCl₂, and 5
mM CaCl₂, pH 7.0 and 22 °C) were performed to determine

(11) (a) Brenowitz, M.; Senear, D. F.; Shea, M. A.; Ackers, G. K. Methods Enzymol. 1986, 130, 132. (b) Brenowitz, M.; Senear, D. F.; Shea, M. A.; Ackers, G. K. Proc. Natl. Acad. Sci. U.S.A. 1986, 83, 8462. (c) Senear, D. F.; Brenowitz, M.; Shea, M. A.; Ackers, G. K. Biochemistry 1986, 25, 7344.

(12) (a) Schultz, P. G.; Taylor, J. S.; Dervan, P. B. J. Am. Chem. Soc. 1982, 104, 6861. (b) Schultz, P. G.; Dervan, P. B. J. Biomol. Struct. Dyn. 1984, 1, 1133. (c) Taylor, J. S.; Schultz, P. G.; Dervan, P. B. Tetrahedron 1984, 40, 457.
[Figure 2 graphic: chemical structures, including 1-E, ImImIm-γ-PyPyPy-β-Dp-EDTA·Fe(II), and 2-E, ImImImPy-γ-PyPyPyPy-β-Dp-EDTA·Fe(II).]

Figure 2. Structures of the tris-imidazole polyamides and their EDTA derivatives synthesized using solid-phase methodology. 7



the equilibrium association constants of the polyamides for the
four bound sites (Table 1). 11 ImImIm-γ-PyPyPy-β-Dp binds
the two match sites 5'-AGGGA-3' and 5'-TGGGT-3' with
equilibrium association constants of Ka = 4.6 × 10⁶ and 7.6 ×
10⁶ M⁻¹, respectively (Figures 6-8). The sequence 5'-TGGGC-3',
which has a mismatch in the non-core final position (an "end"
mismatch), is bound with 6-fold lower affinity than the best
match, while the core mismatch sequence 5'-AGGCA-3' is
bound with 9-fold lower affinity. ImImImPy-γ-PyPyPyPy-β-Dp
binds the match site 5'-AGGGAA-3' with an equilibrium
association constant of Ka = 3.7 × 10⁸ M⁻¹. The end mismatch
5'-TGGGTC-3' is bound with 26-fold lower affinity, and the
two core mismatches, 5'-AGGCAA-3' and 5'-TGGGCT-3', are
bound with 130-fold and 220-fold lower affinity, respectively.
Neither polyamide shows any appreciable binding to the core
mismatch 5'-AGGAA(A)-3' (data not shown).
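
The fold differences quoted above follow directly from the ratio of association constants. The sketch below recomputes them from the Ka values given in this section and in Table 1; the dictionary layout and the function name `fold_lower_affinity` are illustrative choices, not part of the original analysis.

```python
# Equilibrium association constants (M^-1) as quoted in the text and Table 1.
KA = {
    "six-ring": {
        "AGGGA": 4.6e6,   # match
        "TGGGT": 7.6e6,   # match (highest affinity)
        "TGGGC": 1.3e6,   # end mismatch
        "AGGCA": 8.6e5,   # core mismatch
    },
    "eight-ring": {
        "AGGGAA": 3.7e8,  # match
        "TGGGTC": 1.4e7,  # end mismatch
        "TGGGCT": 1.7e6,  # core mismatch
        "AGGCAA": 2.9e6,  # core mismatch
    },
}


def fold_lower_affinity(ka_by_site):
    """Return Ka(best site) / Ka(site) for every site of one polyamide."""
    best = max(ka_by_site.values())
    return {site: best / ka for site, ka in ka_by_site.items()}


if __name__ == "__main__":
    for polyamide, sites in KA.items():
        for site, fold in fold_lower_affinity(sites).items():
            print(f"{polyamide:10s} 5'-{site}-3'  {fold:6.1f}-fold")
```

Rounded to the precision used in the text, the ratios give about 6-fold and 9-fold for the six-ring hairpin and about 26-, 130-, and 220-fold for the eight-ring hairpin.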

Discussion 

The two hairpin polyamides recognize the targeted
5'-AGGGA(A)-3' sequence, as determined by MPE·Fe(II)
footprinting and affinity cleaving, demonstrating specific recognition
by polyamides of sequences containing three contiguous G·C
base pairs, 5'-GGG-3'. Affinity cleavage data indicate that each
polyamide is binding in the minor groove in a single orientation,
consistent with the hairpin binding model (Figures 1 and 5).
The relative intensities of the cleavage patterns are consistent
with quantitative DNase I footprint titration results.

Quantitative DNase I footprint titrations reveal that ImImIm-γ-PyPyPy-β-Dp
binds the designed match sites 5'-AGGGA-3'
and 5'-TGGGT-3' with equilibrium association constants of
Ka = 5 × 10⁶ and 8 × 10⁶ M⁻¹, respectively. For comparison,
the analogous six-ring hairpins containing only one and two
contiguous imidazoles, ImPyPy-γ-PyPyPy-β-Dp and ImImPy-γ-PyPyPy-β-Dp,
bind their respective match sites with affinities
of Ka ≈ 10⁸ M⁻¹. 8,9 The addition of the third imidazole reduces
binding affinity significantly, perhaps due to the inability of
the polyamide to sit deeply in the sterically crowded minor
groove, decreasing energetically favorable van der Waals
contacts. Examination of the eight-ring hairpin, ImImImPy-γ-PyPyPyPy-β-Dp,
reveals a dramatically increased affinity, the
match site 5'-AGGGAA-3' being bound with an equilibrium
association constant of Ka = 4 × 10⁸ M⁻¹. The 80-fold increase
in affinity in expanding from a three-ring to a four-ring hairpin
polyamide mirrors the 66-fold enhancement of ImPyPyPy-Dp
over ImPyPy-Dp that was observed in the 2:1 homodimeric
polyamide:DNA motif. 13

In addition to the observation that the eight-ring ImImImPy-γ-PyPyPyPy-β-Dp
binds with higher affinity than the six-ring
ImImIm-γ-PyPyPy-β-Dp, the enhanced specificity is perhaps
more significant. The six-ring hairpin polyamide binds
5'-TGGGC-3', an end mismatch site, with 6-fold lower affinity
compared to 5'-TGGGT-3' (the highest affinity match), while
the eight-ring hairpin polyamide binds its end mismatch site
5'-TGGGTC-3' with 26-fold lower affinity compared to its
match site 5'-AGGGAA-3'. Similarly, 5'-AGGCA-3', a site
containing a single base pair core mismatch, is bound by the
six-ring system with 9-fold lower affinity, while the two core
mismatch sites for the eight-ring system, 5'-AGGCAA-3' and
5'-TGGGCT-3', are recognized with 130-fold and 220-fold
lower affinity, respectively. For both polyamides, binding of a
site with a core single base pair mismatch results in a greater
energetic penalty than binding of a site with a single base pair
mismatch in the end position. The increased specificity of
ImImImPy-γ-PyPyPyPy-β-Dp over ImImIm-γ-PyPyPy-β-Dp
may indicate that an imidazole placed immediately to the
N-terminal side of the γ turn does not form as strong a hydrogen
bond as in other positions.
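
The "energetic penalty" mentioned above can be put on a free energy scale with the standard relation ΔG° = -RT ln Ka, so that a fold loss in affinity corresponds to ΔΔG° = RT ln(fold). The paper itself does not report these numbers; the sketch below is an illustration that assumes the assay temperature of 22 °C and uses the hypothetical helper name `mismatch_penalty_kcal`.

```python
import math

R_KCAL = 1.987e-3        # gas constant in kcal mol^-1 K^-1
T_ASSAY = 273.15 + 22.0  # assay temperature (22 °C) in kelvin


def mismatch_penalty_kcal(fold, temperature=T_ASSAY):
    """Free energy penalty (kcal/mol) for a given fold loss in Ka.

    Uses ddG = RT ln(Ka_match / Ka_mismatch); this is the textbook
    relation, applied here only as an illustration.
    """
    if fold <= 0:
        raise ValueError("fold must be positive")
    return R_KCAL * temperature * math.log(fold)


if __name__ == "__main__":
    for label, fold in [("six-ring end", 6), ("six-ring core", 9),
                        ("eight-ring end", 26), ("eight-ring core", 130),
                        ("eight-ring core", 220)]:
        print(f"{label:16s} {fold:4d}-fold  {mismatch_penalty_kcal(fold):.2f} kcal/mol")
```

On this scale the six-ring end and core mismatches cost roughly 1.1 and 1.3 kcal/mol, while the eight-ring mismatches cost roughly 1.9 to 3.2 kcal/mol, which restates the specificity trend described in the text.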

Implications for the Design of Minor Groove Binding
Molecules. Pyrrole-imidazole polyamides have been used to
recognize a variety of target sequences containing A·T and G·C
base pairs. 1,2,4,5,9 By recognizing sequences containing three
contiguous G·C base pairs, 5'-(A,T)GGG(A,T)-3' and
5'-(A,T)GGG(A,T)₂-3', this work expands the sequence-composition
repertoire targetable by the hairpin polyamide motif. Both
affinity and specificity for a G,C rich sequence are increased
by the use of an eight-ring hairpin polyamide. This ability to
enlarge the sequence repertoire, combined with rapid solid-phase
synthesis, brings us one step closer to a universal approach for
the recognition of any desired DNA sequence by strictly
chemical methods.

Experimental Section 

Dicyclohexylcarbodiimide (DCC), hydroxybenzotriazole (HOBt),
2-(1H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate
(HBTU), and 0.2 mmol/g of Boc-β-alanine-(4-carboxamidomethyl)benzyl
ester-copoly(styrene-divinylbenzene) resin (Boc-β-Pam-Resin)
were purchased from Peptides International. N,N-Diisopropylethylamine
(DIEA), N,N-dimethylformamide (DMF), N-methylpyrrolidone (NMP),
DMSO/NMP, acetic anhydride (Ac₂O), and
0.0002 M potassium cyanide/pyridine were purchased from Applied
Biosystems. Boc-γ-aminobutyric acid was from NOVA Biochem,
dichloromethane (DCM) and triethylamine (TEA) were reagent grade
from EM, thiophenol (PhSH) and ((dimethylamino)propyl)amine were from
Aldrich, trifluoroacetic acid (TFA) was from Halocarbon, phenol was
from Fisher, and ninhydrin was from Pierce. All reagents were used
without further purification.

(13) Kelly, J. J.; Baird, E. E.; Dervan, P. B. Proc. Natl. Acad. Sci. U.S.A. 1996, 93, 6981.

Figure 3. (Box) Pyrrole and imidazole monomers for synthesis of the polyamides, including imidazole-2-carboxylic acid [monomer names and numbering are largely illegible in the source]. Solid-phase synthetic scheme for ImImImPy-γ-PyPyPyPy-β-Dp-EDTA (2-E). The legible steps alternate Boc deprotection (80% TFA/DCM, 0.4 M PhSH) with couplings of BocPy-OBt (DIEA, DMF), Boc-γ-aminobutyric acid (HBTU, DIEA, DMF), and BocIm-OBt (DCC/HOBt, DIEA, DMF), followed by aminolysis with ((dimethylamino)propyl)amine or 3,3'-diamino-N-methyldipropylamine. [The full numbered step list is not recoverable from the source.]

Quik-Sep polypropylene disposable filters were purchased from
Isolab Inc. and were used for filtration of DCU. Disposable
polypropylene filters were also used for washing resin for ninhydrin and picric
acid tests and for filtering predissolved amino acids into reaction vessels.
A shaker for manual solid-phase synthesis was obtained from St. John
Associates, Inc. Screw-cap glass peptide synthesis reaction vessels
([volume illegible] and 20 mL) with a K sintered glass frit were made as described by
Kent. 14 ¹H NMR spectra were recorded on a General Electric-QE NMR

(14) Kent, S. B. H. Annu. Rev. Biochem. 1988, 57, 957.



spectrometer at 300 MHz in DMSO-d₆, with chemical shifts reported
in parts per million relative to residual solvent. UV spectra were
measured in water on a Hewlett-Packard Model 8452A diode array
spectrophotometer. Matrix-assisted, laser desorption/ionization time
of flight mass spectrometry (MALDI-TOF) was performed at the Protein
and Peptide Microanalytical Facility at the California Institute of
Technology. HPLC analysis was performed on either a HP 1090M
analytical HPLC or a Beckman Gold system using a RAINEN C18,
Microsorb MV, 5 μm, 300 × 4.6 mm reversed phase column in 0.1%
(wt/v) TFA with acetonitrile as eluent and a flow rate of 1.0 mL/min,
gradient elution 1.25% acetonitrile/min. Preparatory reversed-phase
HPLC was performed on a Beckman HPLC with a Waters DeltaPak
25 × 100 mm, 100 μm C18 column equipped with a guard, 0.1% (wt/v)
TFA, 0.25% acetonitrile/min. The 18 MΩ water was obtained from
a Millipore MilliQ water purification system, and all buffers were 0.2 μm
filtered.
ImImIm-γ-PyPyPy-β-Dp (1). The product was synthesized by
manual solid-phase protocols 7 and recovered as a white powder ([mass illegible in source]
mg, 4% recovery). UV λmax 312 (48 500); ¹H NMR (DMSO-d₆) δ
Figure 4. MPE·Fe(II) footprinting and affinity cleavage experiments
on a 3'-³²P-labeled 274 bp EcoRI/PvuII restriction fragment from
plasmid pSES1. The 5'-AGGGA(A)-3', 5'-AGGCA(A)-3', 5'-TGGGT(C)-3',
5'-TGGGC(T)-3', and 5'-AGGAA(A)-3' sites are shown on the
right side of the autoradiogram. Lane 1, intact DNA; lane 2, A reaction;
lane 3, G reaction; lanes 4 and 11, MPE·Fe(II) standard; lane 5, 10
μM ImImImPy-γ-PyPyPyPy-β-Dp; lanes 6 and 7, 10 nM and 100 nM
ImImImPy-γ-PyPyPyPy-β-Dp-EDTA·Fe(II); lane 8, 10 μM
ImImIm-γ-PyPyPy-β-Dp; lanes 9 and 10, 100 nM and 1 μM
ImImIm-γ-PyPyPy-β-Dp-EDTA·Fe(II). All lanes contain 15 kcpm 3'-radiolabeled DNA.
The control and MPE·Fe(II) lanes (1, 4, 5, 8, and 11) contain 25 mM
Tris-acetate buffer (pH 7.0), 10 mM NaCl, and 100 μM/base pair calf
thymus DNA. The affinity cleavage lanes (6, 7, 9, and 10) contain 25
mM Tris-acetate buffer (pH 7.0), 200 mM NaCl, and 50 μg/mL
glycogen.

10.09 (s, 1 H), 9.89 (s, 1 H), 9.88 (s, 1 H), 9.83 (s, 1 H), 9.57 (s, 1 H),
9.19 (br s, 1 H), 8.36 (t, 1 H, J = 5.6 Hz), 8.03 (m, 2 H), 7.64 (s, 1 H),
7.[illegible] (s, 1 H), 7.45 (s, 1 H), 7.20 (d, 1 H, J = 1.0 Hz), 7.15 (d, 1 H, J
= 2.0 Hz), 7.14 (s, 1 H), 7.08 (s, 1 H), 7.04 (s, 1 H), 6.87 (d, 2 H, J
= 1.1 Hz), 4.01 (s, 3 H), 3.99 (s, 3 H), 3.95 (s, 3 H), 3.82 (s, 3 H),
3.82 (s, 3 H), 3.79 (s, 3 H), 3.37 (q, 2 H, J = 5.8 Hz), 3.26 (q, 2 H, J
= 6 Hz), 3.10 (q, 2 H, J = 6.1 Hz), 2.99 (m, 2 H), 2.73 (d, 6 H, J
= 4.8 Hz), 2.34 (t, 2 H, J = 7.2 Hz), 2.27 (t, 2 H, J = 7.3 Hz), 1.79
(m, 4 H); MALDI-TOF-MS, 980.1 (980.1 calcd for M + H).

ImImImPy-γ-PyPyPyPy-β-Dp (2). The product was synthesized
by manual solid-phase protocols 7 and recovered as a white powder (7.6
mg, 11% recovery). UV λmax 248 (42 000), 312 (48 500); ¹H NMR
(DMSO-d₆) δ 10.32 (s, 1 H), 10.13 (s, 1 H), 9.93 (s, 1 H), 9.90 (s, 1
H), 9.89 (s, 1 H), 9.84 (s, 1 H), 9.59 (s, 1 H), 9.23 (br s, 1 H), 8.09 (t,
1 H, J = 5.3 Hz), 8.04 (m, 2 H), 7.65 (s, 1 H), 7.57 (s, 1 H), 7.46 (d,
1 H, J = 0.6 Hz), 7.22 (m, 3 H), 7.16 (s, 2 H), 7.09 (d, 1 H, J = 0.8
Hz), 7.06 (d, 2 H, J = 1.1 Hz), 7.00 (d, 1 H, J = 1.7 Hz), 6.88 (d, 1 H,
J = 1.8 Hz), 6.87 (d, 1 H, J = 1.8 Hz), 4.02 (s, 3 H), 4.00 (s, 3 H), 3.99
(s, 3 H), 3.84 (s, 3 H), 3.83 (s, 3 H), 3.83 (s, 3 H), 3.80 (s, 3 H), 3.79
(s, 3 H), 3.37 (q, 2 H, J = 6.2 Hz), 3.21 (q, 2 H, J = 6.4 Hz), 3.10 (q,
2 H, J = 6.2 Hz), 3.00 (m, 2 H), 2.73 (d, 6 H, J = 4.9 Hz), 2.34 (t, 2
H, J = 7.2 Hz), 2.28 (t, 2 H, J = 7.0 Hz), 1.76 (m, 4 H); MALDI-TOF-MS,
1225.9 (1224.3 calcd for M + H).

ImImIm-γ-PyPyPy-β-Dp-NH₂ (1-NH₂). A sample of machine-synthesized
resin (350 mg, 0.17 mmol/g 15) was placed in a 20 mL glass
scintillation vial and treated with 2 mL of 3,3'-diamino-N-methyldipropylamine
at 55 °C for 13 h. The resin was removed by filtration through
a disposable propylene filter, and the resulting solution was dissolved
in water to a total volume of 3 mL and purified directly by preparatory
reversed-phase HPLC to provide ImImIm-γ-PyPyPy-β-Dp-NH₂ (28 mg,
41% recovery) as a white powder. ¹H NMR (DMSO-d₆) δ 10.[illegible] (s, 1
H), 9.89 (s, 1 H), 9.88 (s, 1 H), 9.83 (s, 1 H), 9.6 (br s, 1 H), 9.59 (s,
1 H), 8.36 (t, 1 H, J = 5.5 Hz), 8.09 (t, 1 H, J = 5.0 Hz), 8.03 (t, 1 H,
J = [illegible] Hz), 7.9 (br s, 3 H), 7.63 (s, 1 H), 7.50 (s, 1 H), 7.44 (s, 1 H),
7.19 (d, 1 H, J = 1.2 Hz), 7.13 (m, 2 H), 7.08 (d, 1 H, J = 1.3 Hz),
7.02 (d, 1 H, J = 1.2 Hz), 6.85 (m, 2 H), 4.01 (s, 3 H), 3.99 (s, 3 H),
3.97 (m, 6 H), 3.80 (s, 3 H), 3.77 (s, 3 H), 3.34 (q, 2 H, J = [illegible].3 Hz),
3.23 (q, 2 H, J = 6.0 Hz), 3.05 (m, 6 H), 2.83 (q, 2 H, J = 5.0 Hz),
2.70 (d, 3 H, J = 4.0 Hz), 2.32 (t, 2 H, J = 6.9 Hz), 2.25 (t, 2 H, J =
[illegible] Hz), 1.90 (m, 2 H), 1.77 (m, 4 H); MALDI-TOF-MS, 1022.8
(1023.1 calcd for M + H).

ImImImPy-γ-PyPyPyPy-β-Dp-NH₂ (2-NH₂). A sample of machine-synthesized
resin (350 mg, 0.16 mmol/g 15) was placed in a 20 mL glass
scintillation vial and treated with 2 mL of 3,3'-diamino-N-methyldipropylamine
at 55 °C for 18 h. The resin was removed by filtration through
a disposable propylene filter, and the resulting solution was dissolved
in water to a total volume of 8 mL and purified directly by preparatory
reversed-phase HPLC to provide ImImImPy-γ-PyPyPyPy-β-Dp-NH₂
(31 mg, 40% recovery) as a white powder. ¹H NMR (DMSO-d₆) δ
10.37 (s, 1 H), 10.16 (s, 1 H), 9.95 (s, 1 H), 9.93 (s, 1 H), 9.91 (s, 1
H), 9.86 (s, 1 H), 9.49 (br s, 1 H), 9.47 (s, 1 H), 8.12 (m, 3 H), 8.0 (br
s, 3 H), 7.65 (s, 1 H), 7.57 (s, 1 H), 7.46 (s, 1 H), 7.20 (m, 3 H), 7.16
(m, 2 H), 7.09 (d, 1 H, J = 1.5 Hz), 7.05 (m, 2 H), 7.00 (d, 1 H, J =
1.6 Hz), 6.88 (m, 2 H), 4.01 (s, 3 H), 3.99 (s, 3 H), 3.98 (s, 3 H), 3.83
(s, 3 H), 3.82 (s, 3 H), 3.81 (s, 3 H), 3.79 (s, 3 H), 3.78 (s, 3 H), 3.36
(q, 2 H, J = 5.3 Hz), 3.21-3.05 (m, 8 H), 2.85 (q, 2 H, J = 4.9 Hz),
2.71 (d, 3 H, J = 4.4 Hz), 2.34 (t, 2 H, J = 5.9 Hz), 2.26 (t, 2 H, J =
5.9 Hz), 1.85 (quintet, J = 5.7 Hz), 1.72 (m, 4 H); MALDI-TOF-MS,
1267.1 (1267.4 calcd for M + H).

ImImIm-γ-PyPyPy-β-Dp-EDTA (1-E). EDTA dianhydride (50
mg) was dissolved in 1 mL of DMSO/NMP solution and 1 mL of DIEA
by heating at 55 °C for 5 min. The dianhydride solution was added to
ImImIm-γ-PyPyPy-β-Dp-NH₂ (1-NH₂) (8.0 mg, 7 μmol) and dissolved
in 750 μL of DMSO. The mixture was heated at 55 °C for [time illegible] min,
treated with 3 mL of 0.1 M NaOH, and heated at 55 °C for 10 min.
TFA (0.1%) was added to adjust the total volume to 8 mL and the
solution was purified directly by preparatory HPLC chromatography
to provide 1-E as a white powder (3.3 mg, 30% recovery). ¹H NMR
(DMSO-d₆) δ 10.14 (s, 1 H), 9.90 (s, 1 H), 9.89 (s, 1 H), 9.85 (s, 1 H),
9.58 (s, 1 H), 9.3 (br s, 1 H), 8.40 (m, 2 H), 8.02 (m, 2 H), 7.65 (s, 1
H), 7.51 (s, 1 H), 7.45 (s, 1 H), 7.20 (d, 1 H, J = 1.5 Hz), 7.15 (m, 2
H), 7.08 (d, 1 H, J = 1.1 Hz), 7.04 (d, 1 H, J = 1.5 Hz), 6.86 (m, 2
H), 4.00 (s, 3 H), 3.98 (s, 3 H), 3.94 (s, 3 H), 3.87 (m, 4 H), 3.82 (s,
3 H), 3.81 (s, 3 H), 3.78 (s, 3 H), 3.72 (m, 4 H), 3.4-3.0 (m, 16 H),
2.71 (d, 3 H, J = 4.2 Hz), 2.33 (t, 2 H, J = 5.1 Hz), 2.25 (t, 2 H, J =
5.9 Hz), 1.75 (m, 6 H); MALDI-TOF-MS, 1298.4 (1298.3 calcd for M
+ H).

(15) Resin substitution has been corrected for the weight of the polyamide
chain. The change in substitution during a specific coupling or for the entire
synthesis can be calculated as L_new (mmol/g) = L_old/(1 + L_old(W_new − W_old)
× 10⁻³), where L is the loading (mmol of amine per gram of resin) and W
is the weight (g mol⁻¹) of the growing polyamide attached to the resin.
See: Barlos, K.; Chatzi, O.; Gatos, D.; Stavropoulos, G. Int. J. Peptide
Protein Res. 1991, 37, 513.
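
The loading correction in footnote 15 is easy to evaluate numerically. The sketch below assumes the formula has the form L_new = L_old/(1 + L_old(W_new − W_old) × 10⁻³), which is how the garbled expression in the scan is read here; the function names and the example mass gain are hypothetical and are not taken from the paper.

```python
def corrected_loading(l_old, w_old, w_new):
    """Resin loading (mmol/g) after the attached chain grows from w_old to w_new.

    l_old is in mmol amine per gram of resin; w_old and w_new are the
    molecular weights (g/mol) of the resin-bound chain before and after.
    """
    return l_old / (1.0 + l_old * (w_new - w_old) * 1e-3)


def mass_gain_for_loading(l_old, l_new):
    """Invert the same relation: chain mass gain (g/mol) implied by a loading drop."""
    return (l_old / l_new - 1.0) / (l_old * 1e-3)


if __name__ == "__main__":
    # Illustrative numbers only: a 0.2 mmol/g resin whose chain gains 1000 g/mol.
    print(round(corrected_loading(0.2, 0.0, 1000.0), 4))
    # Mass gain that would take a 0.2 mmol/g resin to 0.17 mmol/g.
    print(round(mass_gain_for_loading(0.2, 0.17), 1))
```

The correction matters because each gram of derivatized resin contains progressively less polymer support as the chain grows, so the effective substitution falls below the nominal 0.2 mmol/g of the starting resin.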



Recognition of Core 5'-GGG-3' Sequences
[Figure 5 graphic: 274 bp EcoRI/PvuII restriction fragment, with the following region shown in each panel.]

5'-GGTGTCATAAAGGGAATACGCGGACTAGGCAAATGCCGCTGATGGGTCAGTGCTCAGATGGGCTCAGCTTGGCGT-3'
3'-CCACAGTATTTCCCTTATGCGCCTGATCCGTTTACGGCGACTACCCAGTCACGAGTCTACCCGAGTCGAACCGCA-5'

Panels: A. ImImIm-γ-PyPyPy-β-Dp; B and C. ImImIm-γ-PyPyPy-β-Dp-EDTA·Fe(II); D. ImImImPy-γ-PyPyPyPy-β-Dp; E and F. ImImImPy-γ-PyPyPyPy-β-Dp-EDTA·Fe(II).

Figure 5. Results from MPE·Fe(II) footprinting of ImImIm-γ-PyPyPy-β-Dp and ImImImPy-γ-PyPyPyPy-β-Dp and affinity cleavage of
ImImIm-γ-PyPyPy-β-Dp-EDTA·Fe(II) and ImImImPy-γ-PyPyPyPy-β-Dp-EDTA·Fe(II). (Top) Illustration of the 274 bp restriction fragment with the position
of the sequence indicated. Boxes represent equilibrium binding sites determined by the published model. Only sites that were quantitated by DNase
I footprint titrations are boxed. (A and D): MPE·Fe(II) protection patterns for polyamides at 10 μM concentration. Bar heights are proportional to
the relative protection from cleavage at each band. (B and E): Affinity cleavage patterns of ImImIm-γ-PyPyPy-β-Dp-EDTA·Fe(II) at 1 μM and
of ImImImPy-γ-PyPyPyPy-β-Dp-EDTA·Fe(II) at 100 nM, respectively. Arrow heights are proportional to the relative cleavage intensities at each
base pair. (C and F): Ball and stick binding models for the single orientation binding to formal match sequences by the six-ring and eight-ring
EDTA·Fe(II) analogs, respectively. Shaded and nonshaded circles denote imidazole and pyrrole carboxamides, respectively. Nonshaded diamonds
represent the β-alanine residue. The boxed Fe denotes the EDTA·Fe(II) cleavage moiety.

ImImImPy-γ-PyPyPyPy-β-Dp-EDTA (2-E). Compound 2-E was
prepared as described for compound 1-E (yield 3.8 mg, 40%). ¹H NMR
(DMSO-d₆) δ 10.34 (s, 1 H), 10.1[illegible] (s, 1 H), 9.92 (s, 1 H), 9.90 (s, 1
H), 9.89 (s, 1 H), 9.84 (s, 1 H), 9.57 (s, 1 H), 8.42 (m, 1 H), 8.03 (m,
3 H), 7.64 (s, 1 H), 7.56 (s, 1 H), 7.44 (s, 1 H), 7.20 (m, 3 H), 7.15 (m,
2 H), 7.07 (d, 1 H, J = 1.6 Hz), 7.05 (m, 2 H), 6.99 (d, 1 H, J = 1.6
Table 1. Equilibrium Association Constants (M⁻¹) a-c

ImImIm-γ-PyPyPy-β-Dp
  site type        site             Ka (M⁻¹)
  match site       5'-AGGGA-3'      4.6 × 10⁶ (0.3)
  match site       5'-TGGGT-3'      7.6 × 10⁶ (0.5)
  end mismatch     5'-TGGGC-3'      1.3 × 10⁶ (0.3)
  core mismatch    5'-AGGCA-3'      8.6 × 10⁵ (0.4)

ImImImPy-γ-PyPyPyPy-β-Dp
  site type        site             Ka (M⁻¹)
  match site       5'-AGGGAA-3'     3.7 × 10⁸ (0.3)
  end mismatch     5'-TGGGTC-3'     1.4 × 10⁷ (0.5)
  core mismatch    5'-TGGGCT-3'     1.7 × 10⁶ (0.3)
  core mismatch    5'-AGGCAA-3'     2.9 × 10⁶ (0.3)

a Values reported are the mean values measured from a minimum of three DNase I footprint titration experiments, with the standard deviation
for each data set indicated in parentheses. b The assays were performed at 22 °C at pH 7.0 in the presence of 10 mM Tris-HCl, 10 mM KCl, 10
mM MgCl₂, and 5 mM CaCl₂. c Base pairs that are in bold represent formal mismatches.


Figure 6. Quantitative DNase I footprint titration experiment with
ImImImPy-γ-PyPyPyPy-β-Dp on the EcoRI/PvuII restriction fragment
from plasmid pSES1: lane 1, intact DNA; lane 2, A reaction; lane 3,
G reaction; lane 4, DNase I standard; lanes 5-20, 20 pM, 50 pM, 100
pM, 200 pM, 500 pM, 1 nM, 2 nM, 5 nM, 10 nM, 20 nM, 50 nM, 100
nM, 200 nM, 500 nM, 1 μM, 1 μM ImImImPy-γ-PyPyPyPy-β-Dp.
The 5'-AGGGAA-3', 5'-AGGCAA-3', 5'-TGGGTC-3', and 5'-TGGGCT-3'
sites that were analyzed are shown on the right side of the
autoradiogram. All reactions contain 20 kcpm restriction fragment, 10
mM Tris-HCl (pH 7.0), 10 mM KCl, 10 mM MgCl₂, and 5 mM CaCl₂.

Hz), 6.87 (m, 2 H), 4.00 (s, 3 H), 3.98 (s, 3 H), 3.97 (s, 3 H), 3.83 (m,
4 H), 3.82 (s, 6 H), 3.79 (s, 3 H), 3.78 (s, 6 H), 3.67 (m, 4 H), 3.4-3.0
(m, 16 H), 2.71 (d, 3 H, J = 4.2 Hz), 2.34 (t, 2 H, J = 5.4 Hz), 2.25
(t, 2 H, J = 5.9 Hz), 1.72 (m, 6 H); MALDI-TOF-MS, 1542.2 (1542.6
calcd for M + H).

DNA Reagents and Materials. Enzymes were purchased from
Boehringer-Mannheim or New England Biolabs and were used with
their supplied buffers. Deoxyadenosine and thymidine 5'-[α-³²P]triphosphates
and deoxyadenosine 5'-[γ-³²P]triphosphate were obtained
from Amersham. Purified water was obtained by filtering doubly-distilled
water through the MilliQ filtration system from Millipore.
Sonicated, deproteinized calf thymus DNA was acquired from Pharmacia.
All other reagents and materials were used as received. All
DNA manipulations were performed according to standard protocols. 16




[Figure 7 graphic: two titration plots; x-axes are [ImImIm-γ-PyPyPy-β-Dp] (M) (top) and [ImImImPy-γ-PyPyPyPy-β-Dp] (M) (bottom).]

Figure 7. Data from the quantitative DNase I footprint titration
experiments for the two polyamides, ImImIm-γ-PyPyPy-β-Dp (top)
and ImImImPy-γ-PyPyPyPy-β-Dp (bottom), in complex with the
designated sites. The θnorm points were obtained using photostimulable
storage phosphor autoradiography and processed as described in the
Experimental Section. The data points for 5'-AGGGA(A)-3',
5'-TGGGT(C)-3', 5'-AGGCA(A)-3', and 5'-TGGGC(T)-3' sites are
indicated by filled circles (●), open squares (□), filled inverted triangles
(▼), and open circles (○), respectively. The solid curves are the
best-fit Langmuir binding titration isotherms obtained from the nonlinear
least-squares algorithm using eq 2.
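
To make the curve fitting mentioned in the Figure 7 caption concrete, the sketch below fits a Langmuir isotherm to normalized site occupancy by least squares. It is an illustration under stated assumptions: eq 2 is not reproduced in this section, so the textbook form θ = θmin + (θmax − θmin)·Ka[L]/(1 + Ka[L]) is assumed, a simple log-spaced grid search stands in for the authors' nonlinear least-squares algorithm, and the data are synthetic values generated from the reported Ka of 3.7 × 10⁸ M⁻¹ at the concentrations listed for Figure 6.

```python
def langmuir_theta(ka, conc, theta_min=0.0, theta_max=1.0):
    """Fractional site occupancy for a 1:1 Langmuir binding isotherm."""
    bound = ka * conc / (1.0 + ka * conc)
    return theta_min + (theta_max - theta_min) * bound


def fit_ka(concs, thetas, log_lo=4.0, log_hi=11.0, steps=7001):
    """Least-squares estimate of Ka by scanning log10(Ka) on a fine grid.

    A stand-in for a proper nonlinear least-squares routine, chosen here
    only to keep the example dependency-free.
    """
    best_sse, best_ka = None, None
    for i in range(steps):
        ka = 10.0 ** (log_lo + (log_hi - log_lo) * i / (steps - 1))
        sse = sum((langmuir_theta(ka, c) - t) ** 2 for c, t in zip(concs, thetas))
        if best_sse is None or sse < best_sse:
            best_sse, best_ka = sse, ka
    return best_ka


# Polyamide concentrations (M) from the Figure 6 titration series.
CONCS = [20e-12, 50e-12, 100e-12, 200e-12, 500e-12, 1e-9, 2e-9, 5e-9,
         10e-9, 20e-9, 50e-9, 100e-9, 200e-9, 500e-9, 1e-6]

if __name__ == "__main__":
    true_ka = 3.7e8  # reported Ka for the eight-ring match site
    synthetic = [langmuir_theta(true_ka, c) for c in CONCS]
    print(f"fitted Ka = {fit_ka(CONCS, synthetic):.2e} M^-1")
```

For a 1:1 isotherm of this form, half-maximal occupancy occurs at a ligand concentration of 1/Ka, which is why the titration series needs to bracket that concentration for each site.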

Construction of Plasmid DNA. The plasmid pSES1 was constructed
by hybridization of the inserts,
5'-GATCCGGTGTCATAAAGGGAATACGCGGACTAGGCAAATGCCGCTGATGGGTCAGTGCTCAGATGGGCTC-3'
and 5'-AGCTGAGC-



(16) Sambrook, J.; Fritsch, E. F.; Maniatis, T. Molecular Cloning; Cold
Spring Harbor Laboratory: Cold Spring Harbor, NY, 1989.



[Figure 8 graphic. Legible panel labels and values: (A) match, 5'-AGGGA-3' (5 × 10⁶ M⁻¹) and 5'-AGGGAA-3' (4 × 10⁸ M⁻¹); (B) match 5'-TGGGT-3' (8 × 10⁶ M⁻¹) and end mismatch 5'-TGGGTC-3' (1 × 10⁷ M⁻¹); (C) end mismatch 5'-TGGGC-3' and core mismatch 5'-TGGGCT-3' (2 × 10⁶ M⁻¹); (D) core mismatches 5'-AGGCA-3' (9 × 10⁵ M⁻¹) and 5'-AGGCAA-3' (3 × 10⁶ M⁻¹).]

Figure 8. Ball and stick models of ImImIm-γ-PyPyPy-β-Dp (left) and
ImImImPy-γ-PyPyPyPy-β-Dp (right) for each binding site, with the
corresponding equilibrium association constants shown below each
individual model. The binding sites shown are (a) 5'-AGGGA(A)-3',
(b) 5'-TGGGT(C)-3', (c) 5'-TGGGC(T)-3', and (d) 5'-AGGCA(A)-3'.
Shaded and nonshaded circles denote imidazole and pyrrole carboxamides,
respectively. Nonshaded diamonds represent the β-alanine
residue. Formally mismatched base pairs are boxed.

CCATCTGAGCACTGACCCATCAGCGGCATTTGCCTAGTCCGCGTATTCCCTTTATGACACCG-3'.
The hybridized insert was
ligated into linearized pUC19 BamHI/HindIII plasmid using T4 DNA
ligase. The resultant constructs were used to transform Epicurian Coli
XL-2 Blue competent cells from Stratagene. Ampicillin-resistant white
colonies were selected from 25 mL of Luria-Bertani medium agar plates
containing 50 μg/mL ampicillin and treated with XGAL and IPTG
solutions. Large-scale plasmid purification was performed with Qiagen
Maxi purification kits. Dideoxy sequencing was used to verify the
presence of the desired insert. Concentration of the prepared plasmid
was determined at 260 nm using the relationship of 1 OD unit = 50
μg/mL of duplex DNA.
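
As a quick consistency check on the two insert oligonucleotides, the sketch below confirms that they anneal to a duplex with 4-nucleotide 5' overhangs (GATC and AGCT, the sticky ends expected for BamHI and HindIII cuts) and that the four quantitated binding sites lie on the top strand. The sequences are the ones given in the text with the OCR errors corrected; the helper names are hypothetical, and the conversion function simply restates the 1 OD unit = 50 μg/mL relationship.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Insert oligonucleotides as given in the text, written 5'->3'.
TOP = "GATCCGGTGTCATAAAGGGAATACGCGGACTAGGCAAATGCCGCTGATGGGTCAGTGCTCAGATGGGCTC"
BOTTOM = "AGCTGAGCCCATCTGAGCACTGACCCATCAGCGGCATTTGCCTAGTCCGCGTATTCCCTTTATGACACCG"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def anneals_with_overhangs(top, bottom, overhang=4):
    """True if the strands pair everywhere except a 5' overhang on each strand."""
    return reverse_complement(top)[:-overhang] == bottom[overhang:]


def duplex_dna_ug_per_ml(a260):
    """Duplex DNA concentration from absorbance at 260 nm (1 OD = 50 ug/mL)."""
    return 50.0 * a260


if __name__ == "__main__":
    print("anneals:", anneals_with_overhangs(TOP, BOTTOM))
    print("5' overhangs:", TOP[:4], BOTTOM[:4])
    for site in ("AGGGAA", "AGGCAA", "TGGGTC", "TGGGCT"):
        print(site, "at position", TOP.find(site))
```

A check of this kind is useful when transcribing long oligonucleotide sequences, since a single miscopied base would break the complementarity test.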

Preparation of 3'- and 5'-End-Labeled Restriction Fragments.
The plasmid pSES1 was linearized with EcoRI and then treated with
either Klenow fragment, deoxyadenosine 5'-[α-³²P]triphosphate and
thymidine 5'-[α-³²P]triphosphate for 3' labeling, or calf alkaline
phosphatase and then 5' labeled with T4 polynucleotide kinase and
deoxyadenosine 5'-[γ-³²P]triphosphate. The labeled fragment (3' or
5') was then digested with PvuII and loaded onto a 5% non-denaturing
polyacrylamide gel. The desired 274 base pair band was visualized
by autoradiography and isolated. Chemical sequencing reactions were
performed according to published methods. 17

(17) (a) Iverson, B. L.; Dervan, P. B. Nucleic Acids Res. 1987, 15, 7823.
(b) Maxam, A. M.; Gilbert, W. S. Methods Enzymol. 1980, 65, 499.

MPE·Fe(II) Footprinting. All reactions were carried out in a
volume of 40 µL. A polyamide stock solution or water (for reference
lanes) was added to an assay buffer where the final concentrations were
25 mM Tris-acetate buffer (pH 7.0), 10 mM NaCl, 100 µM/base pair
calf thymus DNA, and 15 kcpm 3'- or 5'-radiolabeled DNA. The
solutions were allowed to equilibrate for 4 h. A fresh 50 µM
MPE·Fe(II) solution was made from 100 µL of a 100 µM MPE solution and
100 µL of a 100 µM ferrous ammonium sulfate (Fe(NH4)2(SO4)2·6H2O)
solution. After the 4 h equilibration, MPE·Fe(II) solution (5 µM) was
added, and the reactions were equilibrated for 5 min. Cleavage was
initiated by the addition of dithiothreitol (5 mM) and allowed to proceed
for 14 min. Reactions were stopped by ethanol precipitation, resuspended
in 100 mM tris-borate-EDTA/80% formamide loading buffer,
denatured at 85 °C for 5 min, placed on ice, and immediately loaded
onto an 8% denaturing polyacrylamide gel (5% cross-link, 7 M urea) at
2000 V.
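
The MPE·Fe(II) mixing and dosing steps are plain dilution arithmetic, sketched below. The function names are hypothetical, and the calculation assumes that the 40 µL reaction volume is the final volume that already includes the added aliquot, which the text does not state explicitly.

```python
def mixed_concentration(conc_um, vol_ul, other_vol_ul):
    """Concentration (uM) of a solute after vol_ul of its solution is
    combined with other_vol_ul of a second solution that lacks it."""
    return conc_um * vol_ul / (vol_ul + other_vol_ul)


def stock_volume_ul(stock_um, final_um, final_vol_ul):
    """Volume of stock (uL) giving final_um in final_vol_ul (C1*V1 = C2*V2)."""
    return final_um * final_vol_ul / stock_um


# 100 uL of 100 uM MPE mixed with 100 uL of 100 uM ferrous ammonium sulfate
mpe_fe_um = mixed_concentration(100.0, 100.0, 100.0)

# Aliquot of that mixture needed for 5 uM in a 40 uL reaction (assumed final volume)
aliquot_ul = stock_volume_ul(mpe_fe_um, 5.0, 40.0)

print(mpe_fe_um, aliquot_ul)
```

Under these assumptions the equal-volume mix gives the 50 µM working solution named in the protocol, and one tenth of the reaction volume delivers the 5 µM final concentration.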

Affinity Cleaving. All reactions were carried out in a volume of
400 µL. A polyamide stock solution or water (for reference lanes)
was added to an assay buffer where the final concentrations were 25
mM Tris-acetate buffer (pH 7.0), 200 mM NaCl, 50 µg/mL of
glycogen, and 15 kcpm 3'- or 5'-radiolabeled DNA. After the reactions
were allowed to equilibrate for 4 h, ferrous ammonium sulfate
(Fe(NH4)2(SO4)2·6H2O), 10 µM final concentration, was added. After
another 15 min, cleavage was initiated by the addition of dithiothreitol
(5 mM) and allowed to proceed for 12 min. Reactions were stopped
by ethanol precipitation, resuspended in 100 mM tris-borate-EDTA/80%
formamide loading buffer, denatured at 85 °C for 5 min, placed
on ice, and immediately loaded onto an 8% denaturing polyacrylamide
gel (5% cross-link, 7 M urea) at 2000 V.

DNase I Footprinting. All reactions were carried out in a volume
of 40 µL. We note explicitly that no carrier DNA was used in these
reactions. A polyamide stock solution or water (for reference lanes)
was added to an assay buffer where the final concentrations were 10
mM Tris-HCl buffer (pH 7.0), 10 mM KCl, 10 mM MgCl2, 5 mM
CaCl2, and 20 kcpm 3'-radiolabeled DNA. The solutions were allowed
to equilibrate for a minimum of 4 h at 22 °C (the four-ring hairpin was
allowed to equilibrate for up to 12 h with no noticeable effect on the
data set). Cleavage was initiated by the addition of 4 µL of a DNase
I stock solution (diluted with 1 mM DTT to give a stock concentration
of 0.225 u/mL) and was allowed to proceed for 5 min at 22 °C. The
reactions were stopped by the addition of 3 M sodium acetate solution
containing 50 mM EDTA and then ethanol precipitated. The cleavage
products were resuspended in 100 mM tris-borate-EDTA/80% formamide
loading buffer, denatured at 85 °C for 5 min, placed on ice,
and immediately loaded onto an 8% denaturing polyacrylamide gel (5%
cross-link, 7 M urea) at 2000 V for 1 h. The gels were dried under
vacuum at 80 °C, then quantitated using storage phosphor technology.

Equilibrium association constants were determined as previously
described. 6 * 1 ' The data were analyzed by performing volume
integrations of the 5'-AGGGA(A)-3', 5'-TGGGT(C)-3', 5'-TGGGC(T)-3', and
5'-AGGGA(A)-3' sites and a reference site. The apparent DNA target
site saturation, $\theta_{app}$, was calculated for each concentration of
polyamide using the following equation:

$$\theta_{app} = 1 - \frac{I_{tot}/I_{ref}}{I_{tot}^{\circ}/I_{ref}^{\circ}} \tag{1}$$

where $I_{tot}$ and $I_{ref}$ are the integrated volumes of the target and
reference sites, respectively, and $I_{tot}^{\circ}$ and $I_{ref}^{\circ}$
correspond to those values for a DNase I control lane to which no polyamide
has been added. The ($[L]_{tot}$, $\theta_{app}$) data points were fit to a
Langmuir binding isotherm (eq 2, n = 1) by minimizing the difference between
$\theta_{app}$ and $\theta_{fit}$, using the modified Hill equation:

$$\theta_{fit} = \theta_{min} + (\theta_{max} - \theta_{min}) \frac{K_a^{n}[L]_{tot}^{n}}{1 + K_a^{n}[L]_{tot}^{n}} \tag{2}$$

where $[L]_{tot}$ corresponds to the total polyamide concentration, $K_a$
corresponds to the apparent monomeric association constant, and $\theta_{min}$
and $\theta_{max}$ represent the experimentally determined site saturation
values when the site is unoccupied or saturated, respectively. Data were fit
using a nonlinear least-squares fitting procedure of KaleidaGraph
software (version 2.1, Abelbeck software) with $K_a$, $\theta_{max}$, and
$\theta_{min}$ as the adjustable parameters. All acceptable fits had a
correlation coefficient of R > 0.97. At least three sets of acceptable data
were used in determining each association constant. All lanes from each
gel were used unless visual inspection revealed a data point to be
obviously flawed relative to neighboring points. The data were
normalized using the following equation:



$$\theta_{norm} = \frac{\theta_{app} - \theta_{min}}{\theta_{max} - \theta_{min}} \tag{3}$$
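
The analysis chain described in this section (apparent saturation, Langmuir fit, normalization) can be sketched in a few functions. This is an illustration under stated assumptions, not the authors' procedure: the apparent saturation is taken in its commonly used form, theta_app = 1 - (I_tot/I_ref)/(I_tot°/I_ref°); the fit is a simple grid search over log10(Ka) combined with linear regression, standing in for the KaleidaGraph nonlinear least-squares routine named in the text; and all function names and the synthetic data are hypothetical.

```python
import math


def theta_app(i_tot, i_ref, i_tot0, i_ref0):
    """Apparent saturation from integrated band volumes (assumed form of eq 1)."""
    return 1.0 - (i_tot / i_ref) / (i_tot0 / i_ref0)


def theta_fit(conc, ka, t_min, t_max, n=1):
    """Modified Hill equation (eq 2); n = 1 is the Langmuir isotherm."""
    x = (ka * conc) ** n
    return t_min + (t_max - t_min) * x / (1.0 + x)


def normalize(theta, t_min, t_max):
    """Eq 3: rescale a saturation value onto the 0-1 range."""
    return (theta - t_min) / (t_max - t_min)


def fit_langmuir(concs, thetas, log_ka_min=5.0, log_ka_max=11.0, steps=601):
    """Least-squares estimate of Ka, theta_min and theta_max for n = 1.

    Scans log10(Ka) on a grid. For each trial Ka the isotherm is linear in
    theta_min and (theta_max - theta_min), so those two values come from
    ordinary linear regression. Returns (ka, theta_min, theta_max).
    """
    m = len(concs)
    sy = sum(thetas)
    best = None
    for i in range(steps):
        ka = 10.0 ** (log_ka_min + (log_ka_max - log_ka_min) * i / (steps - 1))
        f = [ka * c / (1.0 + ka * c) for c in concs]
        sf = sum(f)
        sff = sum(v * v for v in f)
        sfy = sum(v * y for v, y in zip(f, thetas))
        denom = m * sff - sf * sf
        if denom <= 0.0:
            continue  # degenerate trial: all fractional occupancies identical
        slope = (m * sfy - sf * sy) / denom
        intercept = (sy - slope * sf) / m
        sse = sum((intercept + slope * v - y) ** 2 for v, y in zip(f, thetas))
        if best is None or sse < best[0]:
            best = (sse, ka, intercept, intercept + slope)
    if best is None:
        raise ValueError("no usable trial Ka; check the input data")
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Synthetic, noise-free titration (made-up values, not data from the paper)
    concs = [5e-11, 1e-10, 2e-10, 5e-10, 1e-9, 2e-9, 5e-9,
             1e-8, 2e-8, 5e-8, 1e-7, 2e-7, 5e-7, 1e-6]
    thetas = [theta_fit(c, 1.0e8, 0.02, 0.95) for c in concs]
    ka, t_min, t_max = fit_langmuir(concs, thetas)
    print(f"log10(Ka) = {math.log10(ka):.2f}, "
          f"theta_min = {t_min:.3f}, theta_max = {t_max:.3f}")
```

The grid search trades speed for transparency; a general nonlinear optimizer would normally be preferred for real gel data, and the grid resolution limits the precision of the reported Ka.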



Quantitation by Storage Phosphor Technology Autoradiography.
Photostimulable storage phosphorimaging plates (Kodak Storage
Phosphor Screen S0230 obtained from Molecular Dynamics) were
pressed flat against gel samples and exposed in the dark at 22 °C for
12-16 h. A Molecular Dynamics 400S PhosphorImager was used to
obtain all data from the storage screens. The data were analyzed by
performing volume integrations of all bands using the ImageQuant v.
3.2.

Abstract: A series of four hairpin pyrrole-imidazole polyamides, ImImPy-γ-PyPyPy-β-Dp,
PyPyPy-γ-ImImPy-β-Dp, AcImImPy-γ-PyPyPy-β-Dp, and AcPyPyPy-γ-ImImPy-β-Dp
(Im = N-methylimidazole-2-carboxamide, Py = N-methylpyrrole-2-carboxamide,
Dp = N,N-dimethylaminopropylamide, γ = γ-aminobutyric acid, β = β-alanine,
and Ac = acetyl), designed for recognition of 5'-(A,T)GG(A,T)2-3' sequences in the
minor groove of DNA were synthesized using solid phase methodology and analyzed
with respect to DNA binding affinity and sequence specificity. Quantitative DNase I
footprint titration experiments reveal that the optimal polyamide ImImPy-γ-PyPyPy-β-Dp
binds a designated 5'-TGGTT-3' match site with an equilibrium association constant of
Ka = 1.0 × 10^8 M^-1 and the single base pair mismatch sites, 5'-TGTTA-3' and
5'-GGGTA-3', with 50-fold and 100-fold lower affinity, respectively
(10 mM Tris·HCl, 10 mM KCl, 10 mM MgCl2, and 5 mM CaCl2, pH 7.0 and 22 °C).
Polyamides of sequence composition AcImImPy-γ-PyPyPy-β-Dp and AcPyPyPy-γ-ImImPy-β-Dp,
which differ only by the position of the γ-linker, bind with similar affinities and
specificities. Recognition of sequences containing contiguous G·C base pairs expands
the sequence repertoire available for targeting DNA with pyrrole-imidazole polyamides.



Introduction 

Pyrrole-imidazole polyamides offer a general method for
the design of non-natural molecules for sequence-specific
recognition in the minor groove of DNA. 1-5 Within the 2:1
polyamide-DNA model, an imidazole (Im) on one ligand
opposite a pyrrolecarboxamide (Py) on the second ligand
recognizes a G·C base pair, while a pyrrolecarboxamide/imidazole
combination targets a C·G base pair. 1,3 A pyrrolecarboxamide/
pyrrolecarboxamide pair is partially degenerate for
A·T or T·A base pairs. 1-3 On the basis of this model, the
recognition of the sequences 5'-(A,T)G(A,T)C(A,T)-3',
5'-(A,T)G(A,T)3-3', 5'-(A,T)2G(A,T)2-3', 4 and 5'-(A,T)GCGC(A,T)-3' 5
has been achieved. However, sequences containing contiguous
G·C base pairs are notably absent from this list.
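
The pairing rules stated above can be written as a small lookup table. The sketch below is illustrative only: the function names are hypothetical, it assumes two ligands (or the two halves of a hairpin) of equal length stacked antiparallel, it reads the site 5' to 3' along the strand next to the first ligand, and it ignores any preferences at flanking positions. The IUPAC code W stands for A or T.

```python
# (ring on ligand 1, ring on ligand 2) -> base read on the strand next to ligand 1
PAIRING = {
    ("Im", "Py"): "G",  # Im/Py recognizes G.C
    ("Py", "Im"): "C",  # Py/Im recognizes C.G
    ("Py", "Py"): "W",  # Py/Py is degenerate for A.T or T.A
}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "W": "W"}


def core_site(first, second):
    """Predict the core site for two antiparallel ligands, each written N->C.

    Because the ligands run in opposite directions, ring i of the first
    ligand sits opposite ring n+1-i of the second.
    """
    if len(first) != len(second):
        raise ValueError("this sketch only handles ligands of equal length")
    return "".join(PAIRING[pair] for pair in zip(first, reversed(second)))


def reverse_complement(site):
    """Same duplex site read 5'->3' along the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(site))


site_a = core_site(["Im", "Im", "Py"], ["Py", "Py", "Py"])
site_b = core_site(["Py", "Py", "Py"], ["Im", "Im", "Py"])
print(site_a, site_b, reverse_complement(site_b))
```

Swapping the order of the two ligands yields the same duplex site read from the other strand, which is why the reverse complement of the second result matches the first.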

Formation of a hairpin polyamide by covalently linking a
polyamide heterodimer with a γ-aminobutyric acid (γ) residue
provides an approximate 300-fold enhancement in affinity over
the unlinked polyamides, ImPyPy- and PyPyPy-. 6 Moreover,
the specificity of the hairpin is greatly improved. The initial
placement of the γ-amino acid turn was chosen for synthetic
ease and was not varied. With the development of solid phase
methodology for polyamide synthesis, we now assess the effect
of varying the position of the γ-turn monomer. 7

* Abstract published in Advance ACS Abstracts, June 15, 1996.

(1) (a) Wade, W. S.; Dervan, P. B. J. Am. Chem. Soc. 1987, 109, 1574-
1575. (b) Wade, W. S.; Mrksich, M.; Dervan, P. B. J. Am. Chem. Soc.
M. 75S6. (d) Wade, W. S.; Mrksich, M.; Dervan, P. B. Biochemistry 1993,
(5) (a) Geierstanger, B. H.; Mrksich, M.; Dervan, P. B.; Wemmer, D. E.
Science 1994, 266, 646-650. (b) Mrksich, M.; Dervan, P. B. J. Am. Chem.
Soc. 1995, 117, 3325.

(6) Mrksich, M.; Parks, M. E.; Dervan, P. B. J. Am. Chem. Soc. 1994,
116, 7983.

In order to explore the recognition of 5'-(A,T)GG(A,T)2-3'
sequences, a series of four head-to-tail linked hairpin polyamides
containing neighboring imidazole rings, ImImPy-γ-PyPyPy-β-Dp (1),
PyPyPy-γ-ImImPy-β-Dp (2), AcImImPy-γ-PyPyPy-β-Dp (3), and
AcPyPyPy-γ-ImImPy-β-Dp (4), were prepared
using solid phase methods (Figures 1 and 2). 7 The polyamides
are all synthesized with Boc-β-alanine-Pam-resin, previously
shown as optimal for polyamides. 8 Each imidazole is expected
to form a specific hydrogen bond with a guanine amino group
allowing the recognition of contiguous G·C base pairs (Figure
1). In addition, the linker turn position is varied within the
nonacetylated and acetylated pairs of polyamides to determine
the effect on the sequence specificity and binding affinity. We
report here the binding specificity and affinity of the polyamides
as determined by the complementary techniques, MPE·Fe(II)
footprinting 9 and quantitative DNase I footprinting. 10 MPE·Fe(II)
footprinting verifies that sequence-specific recognition of
the expected 5'-TGGTT-3' target site has been achieved. In
addition, DNase I quantitative footprint titration experiments
reveal that the position of the γ-linker does not dramatically
affect either affinity or specificity of polyamides, especially the
pair containing acetylated N-termini.

(7) Baird, E. E.; Dervan, P. B. J. Am. Chem. Soc. 1996, 118, 6141-
6146.

(8) Parks, M. E.; Baird, E. E.; Dervan, P. B. J. Am. Chem. Soc. 1996,
118, 6147-6152.

(9) (a) Van Dyke, M. W.; Dervan, P. B. Biochemistry 1983, 22, 2373.
(b) Van Dyke, M. W.; Dervan, P. B. Nucleic Acids Res. 1983, 11, 5555.

(10) (a) Brenowitz, M.; Senear, D. F.; Shea, M. A.; Ackers, G. K.
Methods Enzymol. 1986, 130, 132. (b) Brenowitz, M.; Senear, D. F.; Shea,
M. A.; Ackers, G. K. Proc. Natl. Acad. Sci. U.S.A. 1986, 83, 8462. (c)
Senear, D. F.; Brenowitz, M.; Shea, M. A.; Ackers, G. K. Biochemistry
1986, 25, 7344.

[Figure 1 panels: (A) ImImPy-γ-PyPyPy-β-Dp · TGGTT; (B) PyPyPy-γ-ImImPy-β-Dp · TGGTT;
each drawn on the duplex 5'-TGGTT-3' / 3'-ACCAA-5'.]

Figure 1. Binding model for the complexes formed between polyamides
ImImPy-γ-PyPyPy-β-Dp (1) (A) and PyPyPy-γ-ImImPy-β-Dp
(2) (B) and a 5'-TGGTT-3' sequence. Circles with dots represent lone
pairs of N3 of purines and O2 of pyrimidines. Circles containing an H
represent the N2 hydrogen of guanine. Putative hydrogen bonds are
illustrated by dotted lines. Ball and stick models are also shown. Shaded
and nonshaded circles denote imidazole and pyrrole carboxamides,
respectively. Nonshaded diamonds represent a β-alanine residue.

[Figure 2 structures: ImImPy-γ-PyPyPy-β-Dp (1); PyPyPy-γ-ImImPy-β-Dp (2);
AcImImPy-γ-PyPyPy-β-Dp (3); AcPyPyPy-γ-ImImPy-β-Dp (4).]

Figure 2. Series of polyamides synthesized using solid phase
methodology. 7

Results

Synthesis of Polyamides. The polyamides ImImPy-γ-PyPyPy-β-Dp (1),
PyPyPy-γ-ImImPy-β-Dp (2), AcImImPy-γ-PyPyPy-β-Dp (3), and
AcPyPyPy-γ-ImImPy-β-Dp (4) were
prepared by solid phase methodology (Figure 2). Four unique
pyrrole and imidazole building blocks were combined in a
stepwise manner on a solid support using Boc-chemistry
protocols (Figure 3). For example, polyamide 2,
PyPyPy-γ-ImImPy-β-Dp, was prepared in 14 steps on the resin, and then
cleaved with a single-step aminolysis reaction (Figure 3). All
polyamides were found to be soluble up to at least 1 mM
concentration in aqueous solution.
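
The count of 14 on-resin steps for polyamide 2 follows from one deprotection plus one coupling per residue, with the chain built from the C-terminus toward the N-terminus. The sketch below enumerates those steps; the reagent strings are taken from the Figure 3 caption, while the dictionary layout, the "Py-cap" label for the final unprotected pyrrole acid, and the function name are illustrative assumptions.

```python
# Coupling conditions as listed in the Figure 3 caption (layout is illustrative)
COUPLING = {
    "Py": "Boc-Py-OBt, DIEA, DMF",
    "Im": "Boc-Im-OBt (DCC/HOBt), DIEA, DMF",
    "gamma": "Boc-gamma-aminobutyric acid (HBTU, DIEA)",
    "Py-cap": "pyrrole-2-carboxylic acid (HBTU/DIEA)",
}
DEPROTECT = "80% TFA/DCM, 0.4 M PhSH"


def on_resin_steps(residues_n_to_c):
    """Return the ordered deprotect/couple steps for residues written N->C.

    The beta-alanine is already on the resin and the Dp tail arrives in the
    cleavage step, so neither appears in the residue list.
    """
    steps = []
    for residue in reversed(residues_n_to_c):  # synthesis runs C -> N
        steps.append(DEPROTECT)
        steps.append(COUPLING[residue])
    return steps


# PyPyPy-gamma-ImImPy on Boc-beta-Pam-resin; the N-terminal ring is the cap
polyamide_2 = ["Py-cap", "Py", "Py", "gamma", "Im", "Im", "Py"]

for number, step in enumerate(on_resin_steps(polyamide_2), start=1):
    print(number, step)
```

Seven residues at two operations each give the 14 steps; the aminolysis with (dimethylamino)propylamine is the separate cleavage step.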

Figure 3. Solid phase synthetic scheme for PyPyPy-γ-ImImPy-β-Dp starting from
commercially available Boc-β-Pam-resin: (i) 80% TFA/DCM, 0.4 M PhSH; (ii) Boc-Py-OBt,
DIEA, DMF; (iii) 80% TFA/DCM, 0.4 M PhSH; (iv) Boc-Im-OBt (DCC/HOBt), DIEA, DMF;
(v) 80% TFA/DCM, 0.4 M PhSH; (vi) Boc-Im-OBt (DCC/HOBt), DIEA, DMF; (vii) 80% TFA/DCM,
0.4 M PhSH; (viii) Boc-γ-aminobutyric acid (HBTU, DIEA); (ix) 80% TFA/DCM, 0.4 M PhSH;
(x) Boc-Py-OBt, DIEA, DMF; (xi) 80% TFA/DCM, 0.4 M PhSH; (xii) Boc-Py-OBt, DIEA, DMF;
(xiii) 80% TFA/DCM, 0.4 M PhSH; (xiv) pyrrole-2-carboxylic acid (HBTU/DIEA);
(xv) (dimethylamino)propylamine, 55 °C. (Inset) Pyrrole and imidazole monomers for
synthesis of all compounds described here: pyrrole-2-carboxylic acid 5,
imidazole-2-carboxylic acid 6, 1b Boc-pyrrole-OBt ester 7, 7 and Boc-imidazole acid 8. 7

Footprinting. MPE·Fe(II) footprinting on a 3'- or 5'-32P-end-labeled
266 base pair EcoRI/PvuII restriction fragment from
plasmid pMEPGG (25 mM Tris-acetate, 100 µM bp calf thymus
DNA, 10 mM NaCl) reveals that the synthetic polyamides 1-4,
at 10 µM concentration, bind the designated target site
5'-TGGTT-3' (Figures 4 and 5). In addition, several single base
pair mismatch sites are bound with lower affinity. Quantitative
DNase I footprint titration experiments (10 mM Tris-HCl, 10
mM KCl, 10 mM MgCl2, and 5 mM CaCl2, pH 7.0 and 22 °C)
were performed to determine the equilibrium association
constants of the four polyamides 1-4 for a designated match
site, 5'-TGGTT-3', as well as for two single base pair mismatch
sites, 5'-TGTTA-3' and 5'-GGGTA-3' (Table 1). 10 The polyamide
ImImPy-γ-PyPyPy-β-Dp binds the target site 5'-TGGTT-3'
with the highest affinity (association constant Ka = 1.0 ×
10^8 M^-1) (Figures 6 and 7). The remaining polyamides have
lower but approximately equal association constants of Ka =
~2 × 10^7 M^-1 for the target site. The nonacetylated polyamides
in this series are >50-fold specific for the 5'-TGGTT-3' match
site over either of the single base pair mismatch sites analyzed.
The acetylated pair of polyamides exhibit lower sequence
specificity for the analyzed sites.
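
The fold-specificity figures quoted here are ratios of association constants, and each constant corresponds to a binding free energy through the standard relation ΔG° = -RT ln Ka. The sketch below works one case using the polyamide 1 values for the match site and the 5'-TGTTA-3' mismatch site from Table 1. The free-energy conversion is added here for illustration and is not reported in the paper; the function names are hypothetical.

```python
import math

R_KCAL = 1.98720e-3  # gas constant in kcal/(mol K)
T_KELVIN = 295.15    # 22 C, the assay temperature


def fold_specificity(ka_match, ka_mismatch):
    """Ratio of association constants, match over mismatch."""
    return ka_match / ka_mismatch


def delta_g(ka):
    """Standard binding free energy (kcal/mol) for an association constant in M^-1."""
    return -R_KCAL * T_KELVIN * math.log(ka)


ka_match = 1.0e8     # polyamide 1 at 5'-TGGTT-3'
ka_mismatch = 1.7e6  # polyamide 1 at 5'-TGTTA-3'

print(f"specificity: {fold_specificity(ka_match, ka_mismatch):.0f}-fold")
print(f"ddG: {delta_g(ka_mismatch) - delta_g(ka_match):.2f} kcal/mol")
```

A ratio near 59 is consistent with the roughly 50-fold preference stated in the abstract, and corresponds to a little over 2 kcal/mol at 22 °C.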

Discussion 

Each polyamide within this series specifically binds the five
base pair designated target sequence 5'-TGGTT-3', as shown
by MPE·Fe(II) footprinting experiments, providing the first
example of contiguous G·C recognition in the polyamide-DNA
motif. Interestingly, the polyamides prefer different mismatch
sequences, indicating that the position of the turn alters sequence
selectivity, although only for the mismatches.

Quantitative DNase I footprint titration experiments reveal
that ImImPy-γ-PyPyPy-β-Dp (1) is optimal within this series
of four polyamides. This hairpin binds a 5'-TGGTT-3' match
site with an equilibrium association constant of Ka = 1 × 10^8
M^-1, while the corresponding hairpin PyPyPy-γ-ImImPy-β-Dp
(2), which differs only in the position of the γ turn, shows lower
affinity (Ka = ~2 × 10^7 M^-1) for the 5'-TGGTT-3' site. Both
unacetylated polyamides demonstrate good specificity (>50-fold)
for the target match site over the single base pair mismatch
sites. The acetylated polyamides are similar in affinity to
PyPyPy-γ-ImImPy-β-Dp (2), but exhibit lower specificity.
AcImImPy-γ-PyPyPy-β-Dp (3) and AcPyPyPy-γ-ImImPy-β-Dp
(4) are virtually indistinguishable from each other on the
basis of affinity and specificity for the analyzed target sequences,
indicating little preference for turn position.

This series of contiguous imidazole-containing polyamides
is remarkably similar in affinity and specificity to the single
imidazole-containing hairpin polyamide, ImPyPy-γ-PyPyPy-β-Dp,
indicating little or no energetic penalty in this system for
adjacent imidazoles. 11 Importantly, the position of the hairpin
turn does not significantly affect the recognition of the target
5'-TGGTT-3' match site, although single base pair mismatch
relative affinities are altered.

Implications for the Design of Minor Groove Binding
Molecules. The 2:1 motif has been used to specifically target
several sequences: 5'-TGTCA-3', 1 5'-TGTTA-3', 5'-AAGTT-3', 4
and 5'-TGCGCA-3'. 5 The results reported herein add
sequences containing two contiguous G·C base pairs to the list,
expanding the sequence repertoire for DNA recognition by
polyamides. Furthermore, turn position showed minimal effects
on the specificity and affinity of the polyamides, indicating a
new degree of flexibility within the 2:1 motif. The expansion
of the polyamide sequence repertoire through contiguous G·C
recognition coupled with solid phase synthetic advances allowing
the rapid assembly and characterization of polyamides brings
the goal of sequence-specific recognition of any DNA sequence
by designed molecules closer to fruition.

Experimental Section 

Materials. Boc-glycine-(4-carbonylaminomethyl)-benzyl-ester-copoly(styrene-
divinylbenzene) resin (Boc-G-Pam-resin) (0.2 mmol/g),
0.2 mmol/g Boc-β-alanine-(4-carbonylaminomethyl)-benzyl-ester-copoly(styrene-
divinylbenzene) resin (Boc-β-Pam-resin), dicyclohexylcarbodiimide
(DCC), hydroxybenzotriazole (HOBt),
2-(1H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HBTU),
Boc-glycine, and Boc-β-alanine were purchased from Peptides International.
N,N-Diisopropylethylamine (DIEA), N,N-dimethylformamide (DMF),
N-methylpyrrolidone (NMP), DMSO/NMP, and acetic anhydride
(Ac2O) were purchased from Applied Biosystems. Boc-γ-aminobutyric
acid was from NOVA Biochem, dichloromethane (DCM) and triethylamine
(TEA) were reagent grade from EM, thiophenol (PhSH) and
(dimethylamino)propylamine were from Aldrich, trifluoroacetic acid
(TFA) was from Halocarbon. All reagents were used without further
purification.

Figure 4. MPE·Fe(II) footprinting experiment on a 3'-32P-labeled 266
bp EcoRI/PvuII restriction fragment from plasmid pMEPGG. The
5'-TGGTT-3', 5'-GGGTA-3', and 5'-TGTTA-3' sites are shown on the
right side of the autoradiogram. All reactions contain 10 kcpm restriction
fragment, 25 mM Tris-acetate, 10 mM NaCl, 100 µM calf thymus DNA
(bp), and 5 mM DTT. Lane 1: A reaction. Lane 2: G reaction. Lanes
3 and 8: MPE·Fe(II) standard. Lane 4: 10 µM ImImPy-γ-PyPyPy-β-Dp
(1). Lane 5: 10 µM PyPyPy-γ-ImImPy-β-Dp (2). Lane 6: 10 µM
AcImImPy-γ-PyPyPy-β-Dp (3). Lane 7: 10 µM AcPyPyPy-γ-ImImPy-β-Dp
(4). Lane 9: intact DNA.

1H NMR spectra were recorded in DMSO-d6 on a GE 300 instrument
operating at 300 MHz. Chemical shifts are reported in parts per million
relative to the solvent residual signal. UV spectra were measured on
a Hewlett-Packard Model 8452A diode array spectrophotometer.
Matrix-assisted, laser desorption/ionization time of flight mass
spectrometry (MALDI-TOF-MS) was carried out at the Protein and Peptide
Microanalytical Facility at the California Institute of Technology.
HPLC analysis was performed either on a HP 1090M analytical HPLC
or on a Beckman Gold system using a Rainin C18, Microsorb MV, 5
µm, 300 × 4.6 mm reversed-phase column in 0.1% (w/v) TFA with
acetonitrile as eluent and a flow rate of 1.0 mL/min, gradient elution
1.25% acetonitrile/min. Preparatory HPLC was carried out on a
Beckman HPLC using a Waters DeltaPak 25 × 100 mm, 100 µm C18
column equipped with a guard, 0.1% (w/v) TFA, 0.25% acetonitrile/min.
18 MΩ water was obtained from a Millipore MilliQ water
purification system, and all buffers were 0.2 µm filtered. Reagent-grade
chemicals were used unless otherwise stated.

Activation of Boc-γ-aminobutyric Acid, Imidazole-2-carboxylic
Acid, and Pyrrole-2-carboxylic Acid. The appropriate amino acid or
acid (2 mmol) was dissolved in 2 mL of DMF. HBTU (720 mg, 1.9
mmol) was added followed by DIEA (1 mL) and the solution lightly
shaken for at least 5 min.

Activation of Boc-Imidazole Acid. Boc-imidazole acid (257 mg,
1 mmol) and HOBt (135 mg, 1 mmol) were dissolved in 2 mL of DMF,
DCC (202 mg, 1 mmol) was then added, and the solution was allowed
to stand for at least 5 min.

Typical Manual Synthesis Protocol. PyPyPy-γ-ImImPy-β-Dp.
Boc-β-Pam-resin (1.25 g, 0.25 mmol of amine) was shaken in DMF
for 30 min and drained. The N-Boc group was removed by washing
with DCM for 2 × 30 s, followed by a 1 min shake in 80% TFA/DCM/0.5 M
PhSH, draining the reaction vessel, a brief 80% TFA/DCM/0.5 M PhSH
wash, and 20 min shaking in 80% TFA/DCM/0.5
M PhSH solution. The resin was washed for 1 min with DCM and 30
s with DMF. A resin sample (8-10 mg) was taken for analysis. The
resin was drained completely, Boc-pyrrole-OBt monomer (357 mg, 1
mmol) dissolved in 2 mL of DMF was added, followed by DIEA (1
mL), and the resin was shaken vigorously to make a slurry. The
coupling was allowed to proceed for 45 min. A resin sample (8-10
mg) was taken after 40 min to check reaction progress. The reaction
vessel was washed with DMF for 30 s and dichloromethane for 1 min
to complete a single reaction cycle. Six additional cycles were
performed, adding successively Boc-Im-OH (DCC/HOBt), Boc-Im-OH
(DCC/HOBt), Boc-γ-aminobutyric acid (HBTU/DIEA), Boc-Py-OBt,
Boc-Py-OBt, and pyrrole-2-carboxylic acid (HBTU/DIEA). The resin
was washed with DMF, DCM, MeOH, and ethyl ether and then dried
in vacuo. PyPyPy-γ-ImImPy-β-Pam-resin (180 mg, 29 µmol) 12 was
weighed into a glass scintillation vial, 1.5 mL of
(N,N-dimethylamino)propylamine added, and the mixture heated at 55 °C for 18 h. The
resin was removed by filtration through a disposable polypropylene
filter and washed with 5 mL of water, the amine solution and the water
washes were combined, and the solution was loaded on a C18 preparatory
HPLC column. The polyamide was then eluted in 100 min as a
well-defined peak with a gradient of 0.25% acetonitrile/min. The polyamide
was collected in four separate 8 mL fractions, and the purity of the
individual fractions was verified by HPLC and 1H NMR, to provide
purified PyPyPy-γ-ImImPy-β-Dp (2) (11.2 mg, 39% recovery): UV
λmax 246 (31 100), 312 (51 200); 1H NMR (DMSO-d6) δ 10.30 (s, 1
H), 10.26 (s, 1 H), 9.88 (s, 1 H), 9.80 (s, 1 H), 9.30 (s, 1 H), 9.2 (br
s, 1 H), 8.01 (m, 3 H), 7.82 (br s, 1 H), 7.54 (s, 1 H), 7.52 (s, 1 H),
7.20 (d, 1 H, J = 1.3 Hz), 7.18 (d, 1 H, J = 1.2 Hz), 7.15 (d, 1 H, J
= 1.3 Hz), 7.01 (d, 1 H, J = 1.4 Hz), 6.96 (d, 1 H, J = 1.4 Hz), 6.92
(d, 1 H, J = 1.8 Hz), 6.89 (m, 2 H), 6.03 (t, 1 H, J = 2.4 Hz), 3.97 (s,
3 H), 3.96 (s, 3 H), 3.85 (s, 3 H), 3.82 (s, 3 H), 3.78 (m, 6 H), 3.37 (m,
2 H), 3.20 (q, 2 H, J = 5.7 Hz), 3.08 (q, 2 H, J = 6.6 Hz), 2.94 (q, 2
H, J = 5.3 Hz), 2.71 (d, 6 H, J = 5.S Hz), 2.32 (m, 4 H), 1.83 (m, 4 H);
MALDI-TOF-MS 978.7 (979.1 calcd for M + H).
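
The resin bookkeeping behind the quoted recovery can be checked with a short calculation. This is a sketch under stated assumptions: the substitution formula of footnote 12 is read here as L_new = L_old / (1 + L_old (W_new - W_old) × 10^-3), which follows from a simple mass balance on 1 g of starting resin; the molecular weight of polyamide 2 is approximated as the calculated M + H value minus 1.0; and the function names are hypothetical.

```python
def new_loading(l_old, w_old, w_new):
    """Resin substitution (mmol/g) after the attached chain grows.

    l_old is the starting loading in mmol/g; w_old and w_new are the
    weights (g/mol) of the resin-bound species before and after growth.
    """
    return l_old / (1.0 + l_old * (w_new - w_old) * 1e-3)


def percent_recovery(product_mg, resin_mg, loading_mmol_per_g, mol_weight):
    """Isolated product as a percentage of the amount carried by the cleaved resin."""
    resin_mmol = resin_mg / 1000.0 * loading_mmol_per_g
    theoretical_mg = resin_mmol * mol_weight
    return 100.0 * product_mg / theoretical_mg


# 29 umol of polyamide on 180 mg of resin, expressed in mmol/g
loading = 0.029 / 0.180

# 11.2 mg isolated; molecular weight assumed to be 979.1 - 1.0
recovery = percent_recovery(11.2, 180.0, loading, 978.1)
print(f"loading = {loading:.3f} mmol/g, recovery = {recovery:.1f}%")
```

With these inputs the calculation reproduces the stated 39% recovery to within rounding.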

ImImPy-γ-PyPyPy-β-Dp (1). Polyamide was prepared by machine-assisted
solid phase synthesis protocols, 7 and 900 mg of resin was
cleaved and purified to provide 1 as a white powder (69 mg, 48%
recovery): UV λmax 246 (43 300), 308 (54 200); 1H NMR (DMSO-d6)
δ 10.31 (s, 1 H), 9.91 (s, 1 H), 9.90 (s, 1 H), 9.85 (s, 1 H), 9.75 (s, 1
H), 9.34 (br s, 1 H), 8.03 (m, 3 H), 7.56 (s, 1 H), 7.46 (s, 1 H), 7.21
(m, 2 H), 7.15 (m, 2 H), 7.07 (d, 1 H, J = 1.2 Hz), 7.03 (d, 1 H, J =
1.3 Hz), 6.98 (d, 1 H, J = 1.2 Hz), 6.87 (m, 2 H), 4.02 (m, 6 H), 3.96
(m, 6 H), 3.87 (m, 6 H), 3.75 (q, 2 H, J = 4.9 Hz), 3.36 (q, 2 H, J =
4.0 Hz), 3.20 (q, 2 H, J = 4.7 Hz), 3.01 (q, 2 H, J = 5.1 Hz), 2.71 (d,
6 H, J = 4.8 Hz), 2.42 (m, 4 H), 1.80 (m, 4 H); MALDI-TOF-MS
978.8 (979.1 calcd for M + H).

(11) Kent, S. B. H. Annu. Rev. Biochem. 1988, 57, 957.

(12) Resin substitution can be calculated as L_new (mmol/g) = L_old/(1 +
L_old(W_new - W_old) × 10^-3), where L is the loading (mmol of amine/g of
resin), and W is the weight (g mol^-1) of the growing polyamide attached to
the resin. See: Barlos, K.; Chatzi, O.; Gatos, D.; Stavropoulos, G. Int. J.
Pept. Protein Res. 1991, 37, 513.

[Figure 5 panels, top to bottom: ImImPy-γ-PyPyPy-β-Dp (10 µM); PyPyPy-γ-ImImPy-β-Dp (10 µM);
AcImImPy-γ-PyPyPy-β-Dp (10 µM); AcPyPyPy-γ-ImImPy-β-Dp (10 µM); each shown over the
duplex sequence of the restriction fragment insert.]

Figure 5. Histograms of cleavage protection (footprinting) data. (Top) Illustration of the 266 bp restriction fragment with the position of the
sequence indicated. MPE·Fe(II) protection patterns for polyamides at 10 µM concentration. Bar heights are proportional to the relative protection
from cleavage at each band. Boxes represent equilibrium binding sites determined by the published model. 9 Only sites that were quantitated by
DNase I footprint titrations are boxed.

Table 1. Equilibrium Association Constants (M^-1) a,b

                              match site          single mismatch sites
polyamide                     5'-aTGGTTt-3'       5'-TGTTAt-3'        5'-tGGGTAg-3'
ImImPy-γ-PyPyPy-β-Dp          1.0 × 10^8 (0.1)    1.7 × 10^6 (0.6)    <1 × 10^6
PyPyPy-γ-ImImPy-β-Dp          1.6 × 10^7 (0.2)    <1 x 10 J           <1 x 10'
AcImImPy-γ-PyPyPy-β-Dp        1.3 × 10^7 (0.7)    1.6 x 10* (1.1)     1.3 × 10^6 (0.8)
AcPyPyPy-γ-ImImPy-β-Dp        2.0 × 10^7 (0.3)    1.3 x 10* (0.5)     ≤1 x 10*

a Values reported are the mean values measured from at least three footprint titration experiments, with the standard deviation for each data set
indicated in parentheses. b The assays were performed at 22 °C at pH 7.0 in the presence of 10 mM Tris-HCl, 10 mM KCl, 10 mM MgCl2, and 5
mM CaCl2.

AcImImPy-γ-PyPyPy-β-Dp (3). Polyamide was prepared by
manual solid phase protocols and isolated as a white powder (8 mg,
28% recovery): UV λmax 246 (43 400), 312 (50 200); 1H NMR
(DMSO-d6) δ 10.35 (s, 1 H), 10.30 (s, 1 H), 9.97 (s, 1 H), 9.90 (s, 1 H), 9.82
(s, 1 H), 9.30 (s, 1 H), 9.2 (br s, 1 H), 8.02 (m, 3 H), 7.52 (s, 1 H), 7.48
(s, 1 H), 7.21 (m, 2 H), 7.16 (d, 1 H, J = 1.1 Hz), 7.11 (d, 1 H, J =
1.2 Hz), 7.04 (d, 1 H, J = 1.1 Hz), 6.97 (d, 1 H, J = 1.3 Hz), 6.92 (d,
1 H, J = 1.4 Hz), 6.87 (d, 1 H, J = 1.2 Hz), 3.99 (s, 3 H), 3.97 (s, 3
H), 3.83 (s, 3 H), 3.82 (s, 3 H), 3.80 (s, 3 H), 3.79 (s, 3 H), 3.47 (q, 2
H, J = 4.7 Hz), 3.30 (q, 2 H, J = 4.6 Hz), 3.20 (q, 2 H, J = 5.0 Hz),
3.05 (q, 2 H, J = 5.1 Hz), 2.75 (d, 6 H, J = 4.1 Hz), 2.27 (m, 4 H),
2.03 (s, 3 H), 1.74 (m, 4 H); MALDI-TOF-MS 1036.4 (1036.1 calcd
for M + H).

AcPyPyPy-γ-ImImPy-β-Dp (4). Polyamide was prepared by
machine-assisted solid phase protocols 7 as a white powder (14 mg, 48%
recovery): UV λmax 246 (44 400), 312 (52 300); 1H NMR (DMSO-d6)
δ 10.32 (s, 1 H), 10.28 (s, 1 H), 9.89 (m, 2 H), 9.82 (s, 1 H), 9.18 (s, 1
H), 9.10 (br s, 1 H), 8.03 (m, 3 H), 7.55 (s, 1 H), 7.52 (s, 1 H), 7.21
(d, 1 H, J = 1.1 Hz), 7.18 (d, 1 H, J = 7.16 Hz), 7.15 (d, 1 H, J = 1.0
Hz), 7.12 (d, 1 H, J = 1.0 Hz), 7.02 (d, 1 H, J = 1.0 Hz), 6.92 (d, 1
H, J = 1.1 Hz), 6.87 (d, 1 H, J = 1.1 Hz), 6.84 (d, 1 H, J = 1.0 Hz),
3.97 (s, 3 H), 3.93 (s, 3 H), 3.87 (s, 3 H), 3.80 (s, 3 H), 3.78 (m, 6 H),
3.35 (q, 2 H, J = 5.6 Hz), 3.19 (q, 2 H, J = 5.3 Hz), 3.08 (q, 2 H, J
= 5.7 Hz), 2.87 (q, 2 H, J = 5.8 Hz), 2.71 (d, 6 H, J = 4.0 Hz), 2.33
(m, 4 H), 1.99 (s, 3 H), 1.74 (m, 4 H); MALDI-TOF-MS 1036.2 (1036.1
calcd for M + H).

Figure 6. Quantitative DNase I footprint titration experiment with
ImImPy-γ-PyPyPy-β-Dp (1) on the 3'-32P-labeled 266 base pair EcoRI/PvuII
restriction fragment from plasmid pMEPGG. Lane 1: A reaction.
Lane 2: G reaction. Lanes 3 and 18: DNase I standard. Lanes 4-17:
50 pM, 100 pM, 200 pM, 500 pM, 1 nM, 2 nM, 5 nM, 10 nM, 20 nM,
50 nM, 100 nM, 200 nM, 500 nM, 1 µM ImImPy-γ-PyPyPy-β-Dp
(1), respectively. Lane 19: intact DNA. The 5'-TGGTT-3', 5'-GGGTA-3',
and 5'-TGTTA-3' sites which were analyzed are shown on the right
side of the autoradiogram. All reactions contain 10 kcpm restriction
fragment, 10 mM Tris-HCl, 10 mM KCl, 10 mM MgCl2, and 5 mM
CaCl2.

Construction of Plasmid DNA. Using T4 DNA ligase, the plasmid
pMEPGG was constructed by ligation of an insert,
5'-GATCGCGAGCTCAGCGATGGTTTGCGATAGCGTGATTAGCGTATGCGTGGGTAGCGATACGC-3' and
5'-GCGTATCGCTACCCACGCATACGCTAATCACGCTATCGCAAACCATCGCTGAGCTCGCGATC-3',
into pUC19 previously cleaved with BamHI and
HindIII. Ligation products were used to transform Epicurian Coli XL
1 Blue competent cells (Stratagene). Colonies were selected for
α-complementation on 25 mL Luria-Bertani medium agar plates
containing 50 µg/mL ampicillin and treated with XGAL and IPTG
solutions. Large-scale plasmid purification was performed with Qiagen
purification kits. Plasmid DNA concentration was determined at 260
nm using the relation 1 OD unit = 50 µg/mL duplex DNA. The
plasmid was linearized with EcoRI, followed by treatment with either
Klenow, deoxyadenosine 5'-[α-32P]triphosphate (Amersham), and
thymidine 5'-[α-32P]triphosphate for 3' labeling or calf alkaline phosphatase
and subsequent 5' end labeling with T4 polynucleotide kinase and
γ-[32P]dATP. The 3'- or 5'-end-labeled fragment was then digested
with PvuII and isolated by nondenaturing gel electrophoresis. The 3'-
or 5'-32P-end-labeled 266 base pair EcoRI/PvuII restriction fragment
was used in all experiments described here. Chemical sequencing
reactions were performed according to published protocols. 13 Standard
protocols were used for all DNA manipulations. 14
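
A quick consistency check on the two insert oligonucleotides, as transcribed in this copy, is to confirm that they are reverse complements of one another and that the quantitated binding sites are present. The sketch below does that; the sequences are written in ten-base chunks purely for readability, and the variable and function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# First oligonucleotide of the insert, 5'->3'
TOP = ("GATCGCGAGC" "TCAGCGATGG" "TTTGCGATAG" "CGTGATTAGC"
       "GTATGCGTGG" "GTAGCGATAC" "GC")

# Second oligonucleotide of the insert, 5'->3'
BOTTOM = ("GC" "GTATCGCTAC" "CCACGCATAC" "GCTAATCACG"
          "CTATCGCAAA" "CCATCGCTGA" "GCTCGCGATC")

print(len(TOP), len(BOTTOM))
print("strands complementary:", reverse_complement(TOP) == BOTTOM)
print("match site present:", "TGGTT" in TOP)
print("GGGTA mismatch site present:", "GGGTA" in TOP)
```

The 5'-TGTTA-3' mismatch site does not occur in the insert as transcribed, so this check does not cover it.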

Identification of Binding Sites by MPE·Fe(II) Footprinting. All
reactions were carried out in a total volume of 40 µL with final
concentrations of species as indicated in parentheses. The ligands were
added to solutions of radiolabeled restriction fragment (10 000 cpm),
calf thymus DNA (100 µM bp), Tris-acetate (25 mM, pH 7.0), and
NaCl (10 mM) and incubated for 1 h at 22 °C. A 50 µM MPE·Fe(II)
solution was prepared by mixing 100 µL of a 100 µM MPE solution
with a freshly prepared 100 µM ferrous ammonium sulfate solution.
Footprinting reactions were initiated by the addition of MPE·Fe(II) (5
µM), followed 5 min later by the addition of dithiothreitol (5 mM),
and allowed to proceed for 15 min at 22 °C. Reactions were stopped
by ethanol precipitation, resuspended in 100 mM tris-borate-EDTA/80%
formamide loading buffer, and electrophoresed on 8% polyacrylamide
denaturing gels (5% cross-link, 7 M urea) at 2000 V for 1 h.
The gels were analyzed using storage phosphor technology.

[Figure 7 plot; x-axis: [Polyamide].]

Figure 7. Data for the quantitative DNase I footprint titration
experiments for the four polyamides 1-4 in complex with the
designated 5'-TGGTT-3' site. The θ_norm points were obtained using
photostimulable storage phosphor autoradiography and processed as
described in the Experimental Section. The data points for
ImImPy-γ-PyPyPy-β-Dp (1), PyPyPy-γ-ImImPy-β-Dp (2),
AcImImPy-γ-PyPyPy-β-Dp (3), and AcPyPyPy-γ-ImImPy-β-Dp (4) are indicated by
O, ■, and O, respectively. The solid curves are the best-fit Langmuir
binding titration isotherms obtained from a nonlinear least squares
algorithm using eq 2.

Analysis of Energetics by Quantitative DNase I Footprint
Titration. All reactions were executed in a total volume of 40 µL
with final concentrations of each species as indicated. The ligands,
ranging from 50 pM to 1 µM, were added to solutions of radiolabeled
restriction fragment (10 000 cpm), Tris-HCl (10 mM, pH 7.0), KCl
(10 mM), MgCl₂ (10 mM), and CaCl₂ (5 mM) and incubated for 4 h
at 22 °C. Footprinting reactions were initiated by the addition of 4 µL
of a stock solution of DNase I (0.025 unit/mL) containing 1 mM
dithiothreitol and allowed to proceed for 6 min at 22 °C. The reaction
mixtures were stopped by addition of a 3 M sodium acetate solution
containing 50 mM EDTA and ethanol precipitated. The reactions were
resuspended in 100 mM tris-borate-EDTA/80% formamide loading
buffer and electrophoresed on 8% polyacrylamide denaturing gels (5%
cross-link, 7 M urea) at 2000 V for 1 h. The footprint titration gels
were dried and quantitated using storage phosphor technology.

Equilibrium association constants were determined as previously
described.*- 10 The data were analyzed by performing volume
integrations of the 5'-TGGTT-3', 5'-TGTTA-3', and 5'-GGGTA-3' sites and
a reference site. Binding sites are assumed to be independent and
noninteracting as they are separated by at least one full turn of the
double helix. The apparent DNA target site saturation, $\theta_{app}$, was
calculated for each concentration of polyamide using the following
equation:

$$\theta_{app} = 1 - \frac{I_{tot}/I_{ref}}{I_{tot}^{\circ}/I_{ref}^{\circ}} \qquad (1)$$

(13) (a) Iverson, B. L.; Dervan, P. B. Nucleic Acids Res. 1987, 15, 7823-
7830. (b) Maxam, A. M.; Gilbert, W. S. Methods Enzymol. 1980, 65, 499-
560.

(14) Sambrook, J.; Fritsch, E. F.; Maniatis, T. Molecular Cloning; Cold
Spring Harbor Laboratory: Cold Spring Harbor, NY, 1989.
where $I_{tot}$ and $I_{ref}$ are the integrated volumes of the target and reference
sites, respectively, and $I_{tot}^{\circ}$ and $I_{ref}^{\circ}$ correspond to those values for a
DNase I control lane to which no polyamide has been added. The
($[L]_{tot}$, $\theta_{app}$) data points were fitted to a Langmuir binding isotherm
(eq 2, n = 1) by minimizing the difference between $\theta_{app}$ and $\theta_{fit}$ using
the modified Hill equation:

$$\theta_{fit} = \theta_{min} + (\theta_{max} - \theta_{min}) \frac{K_a^{n}[L]_{tot}^{n}}{1 + K_a^{n}[L]_{tot}^{n}} \qquad (2)$$

where $[L]_{tot}$ corresponds to the total polyamide concentration, $K_a$
corresponds to the association constant, and $\theta_{min}$ and $\theta_{max}$ represent
the experimentally determined site saturation values when the site is
unoccupied or saturated, respectively. The concentration of DNA used
for quantitative footprint titrations is <50 pM, which justifies the
assumption that free ligand concentration is approximately equal to
total ligand concentration. 1 " Data were fitted using a nonlinear least
squares fitting procedure of KaleidaGraph software (version 2.1,
Abelbeck software) running on a Power Macintosh 6100/60AV
computer with $\theta_{max}$ and $\theta_{min}$ as the adjustable parameters. The
goodness-of-fit of the binding curve to the data points is evaluated by
the correlation coefficient, with R > 0.97 as the criterion for an
acceptable fit. At least three sets of acceptable data were used in
determining each association constant. All lanes from each gel were
used unless visual inspection revealed a data point to be obviously
flawed relative to neighboring points. The data were normalized using
the following equation:

$$\theta_{norm} = \frac{\theta_{app} - \theta_{min}}{\theta_{max} - \theta_{min}} \qquad (3)$$



Quantitation by Storage Phosphor Technology Autoradiography.
Photostimulable storage phosphorimaging plates (Kodak Storage
Phosphor Screen SO230 obtained from Molecular Dynamics) were
pressed flat against gel samples and exposed in the dark at 22 °C for
12-16 h. A Molecular Dynamics 400S PhosphorImager was used to
obtain all data from the storage screens. The data were analyzed by
performing volume integrations of all bands using the ImageQuant
version 3.2 software running on an AST Premium 386/33 computer.
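
The assumption stated earlier in this section, that free ligand is approximately equal to total ligand when the DNA concentration is below 50 pM, can be checked with a short calculation. The sketch below is illustrative only: it assumes a single 1:1 binding site (n = 1) and solves the mass balance exactly; the function name and the chosen association constants are hypothetical and do not come from the paper.

```python
import math

def free_fraction(ka, l_tot, dna_tot):
    """Fraction of ligand left unbound for 1:1 binding, from the exact mass balance.

    ka      : association constant (M^-1)
    l_tot   : total ligand concentration (M)
    dna_tot : total binding-site concentration (M)
    """
    s = l_tot + dna_tot + 1.0 / ka
    bound = (s - math.sqrt(s * s - 4.0 * l_tot * dna_tot)) / 2.0
    return (l_tot - bound) / l_tot

DNA = 50e-12  # upper bound on the DNA concentration quoted in the text (M)

# Evaluate near half-saturation ([L]tot = 1/Ka), where depletion matters most.
for ka in (1e7, 1e8, 1e9):
    frac = free_fraction(ka, 1.0 / ka, DNA)
    print(f"Ka = {ka:.0e} M^-1: free fraction = {frac:.4f}")
```

Under these assumptions the free fraction stays close to 1 for association constants up to about 10⁹ M⁻¹, which is the regime in which treating total ligand as free ligand is reasonable.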
Abstract: The sequence-specific recognition of the minor groove of DNA by pyrrole-imidazole polyamides has
been extended to 9-13 base pairs (bp). Four polyamides, ImPyPy-Py-PyPyPy-Dp, ImPyPy-G-PyPyPy-Dp,
ImPyPy-β-PyPyPy-Dp, and ImPyPy-γ-PyPyPy-Dp (Im = N-methylimidazole, Py = N-methylpyrrole, Dp = N,N-
dimethylaminopropylamide, G = glycine, β = β-alanine, and γ = γ-aminobutyric acid), were synthesized and
characterized with respect to their DNA-binding affinities and specificities at sequences of composition
5'-(A,T)G(A,T)₅C(A,T)-3' (9 bp) and 5'-(A,T)₅G(A,T)C(A,T)₅-3' (13 bp). In both sequence contexts, the β-alanine-linked
compound ImPyPy-β-PyPyPy-Dp has the highest binding affinity of the four polyamides, binding the 9 bp site
5'-TGTTAAACA-3' (Ka = 8 × 10⁸ M⁻¹) and the 13 bp site 5'-AAAAAGACAAAAA-3' (Ka = 5 × 10⁹ M⁻¹) with
affinities higher than the formally N-methylpyrrole-linked polyamide ImPyPy-Py-PyPyPy-Dp by factors of ~8 and
~85, respectively (10 mM Tris-HCl, 10 mM KCl, 10 mM MgCl₂, and 5 mM CaCl₂, pH 7.0). The binding data for
ImPyPy-γ-PyPyPy-Dp, which has been shown previously to bind DNA in a "hairpin" conformation, indicate that
γ-aminobutyric acid does not effectively link polyamide subunits in an extended conformation. These results expand
the binding site size targetable with pyrrole-imidazole polyamides and provide structural elements that will facilitate
the design of new polyamides targeted to other DNA sequences.



Introduction 



The development of 2:1 pyrrole-imidazole polyamide-DNA
complexes provides a new model for the design of molecules
for sequence-specific recognition in the minor groove of DNA.
The polyamide ImPyPy-Dp was shown to specifically bind the
mixed A,T/G,C sequence 5'-(A,T)G(A,T)C(A,T)-3' as a side-
by-side antiparallel dimer.1 In this complex, each polyamide
makes specific contacts with one strand on the floor of the minor
groove such that the sequence specificity depends on the
sequence of side-by-side amino acid pairings. A side-by-side
pairing of imidazole opposite pyrrole recognizes G·C base pairs,
while a side-by-side pairing of pyrrole opposite imidazole
recognizes C·G base pairs.1 A side-by-side pyrrole-pyrrole
pairing is partially degenerate and targets both A·T and T·A
base pairs.1,2 The generality of the 2:1 model has been
demonstrated by targeting other sequences of mixed A,T/G,C
composition.3-5 ImPyPy-Dp and distamycin (PyPyPy) bind
simultaneously to a 5'-(A,T)G(A,T)₃-3' site as an antiparallel

* Abstract published in Advance ACS Abstracts, June 15, 1996.

(1) (a) Wade, W. S.; Mrksich, M.; Dervan, P. B. J. Am. Chem. Soc.
1992, 114, 8783. (b) Mrksich, M.; Wade, W. S.; Dwyer, T. J.; Geierstanger,
B. H.; Wemmer, D. E.; Dervan, P. B. Proc. Natl. Acad. Sci. U.S.A. 1992,
89, 7586. (c) Wade, W. S.; Mrksich, M.; Dervan, P. B. Biochemistry 1993,
32, 11385.

(2) (a) Pelton, J. G.; Wemmer, D. E. Proc. Natl. Acad. Sci. U.S.A. 1989,
86, 5723. (b) Pelton, J. G.; Wemmer, D. E. J. Am. Chem. Soc. 1990, 112,
1393. (c) Chen, X.; Ramakrishnan, B.; Rao, S. T.; Sundaralingam, M. Nature
Struct. Biol. 1994, 1, 169.

(3) (a) Mrksich, M.; Dervan, P. B. J. Am. Chem. Soc. 1993, 115, 2572-
2576. (b) Geierstanger, B. H.; Jacobsen, J.-P.; Mrksich, M.; Dervan, P. B.;
Wemmer, D. E. Biochemistry 1994, 33, 3055.

(4) Geierstanger, B. H.; Dwyer, T. J.; Bathini, Y.; Lown, J. W.; Wemmer,
D. E. J. Am. Chem. Soc. 1993, 115, 4474.

(5) (a) Geierstanger, B. H.; Mrksich, M.; Dervan, P. B.; Wemmer, D. E.
Science 1994, 266, 646. (b) Mrksich, M.; Dervan, P. B. J. Am. Chem. Soc.
1995, 117, 3325.




Figure 1. Ribbon models of (a) 9 bp "overlapped" and (b) 13 bp
"slipped" 2:1 polyamide-DNA complexes.

heterodimer.3 A PyImPy polyamide and distamycin bind
5'-(A,T)₂G(A,T)₂-3',4 and ImPyImPy-Dp targets
5'-(A,T)GCGC(A,T)-3'.5

The binding affinity and sequence specificity of a noncovalent
antiparallel homodimeric or heterodimeric polyamide-DNA

Figure 2. (Top) Complexes of ImPyPy-β-PyPyPy-Dp with the targeted sites (a) 5'-TGTTAAACA-3' (9 bp, "overlapped") and (b) 5'-
AAAAAGACAAAAA-3' (13 bp, "slipped"). Circles with dots represent lone pairs on N3 of purines and O2 of pyrimidines. Circles containing an
H represent the N2 hydrogen of guanine. Putative hydrogen bonds are illustrated by dashed lines. (Bottom) Complexes of ImPyPy-X-PyPyPy-Dp,
where X = Py, G, and β, with (a) 5'-TGTTAAACA-3' and (b) 5'-AAAAAGACAAAAA-3'. The shaded and light circles represent imidazole and
pyrrole rings, respectively, and the diamond represents the internal amino acid X. The specifically targeted guanines are highlighted.

exhibited a binding isotherm in quantitative footprinting experi-
ments consistent with formation of an intramolecular "hairpin"
complex in which the polyamide folds back on itself.7 Modeling
suggested that the glycine- and β-alanine-linked polyamides do
not favorably bind as "hairpins" in the minor groove of DNA.
Moreover, these polyamides exhibited cooperative binding
isotherms in quantitative footprinting experiments, consistent
with two polyamides binding in extended conformations as
intermolecular dimers.7,8 It appears that the glycine- and
β-alanine-linked polyamides disfavor binding in the hairpin
conformation and prefer to bind in an extended conformation.7,8
In a formal sense, there are multiple extended binding motifs
(and hence multiple binding site sequences) for polyamides of
sequence composition Im(Py)x-Dp, as discussed below.

"Overlapped" and "Slipped" Binding Modes. We report
here the DNA-binding affinities of four polyamides having the



complex can be increased by covalently linking the two
polyamides.6,7 The DNA-binding properties of the polyamides
ImPyPy-G-PyPyPy-Dp, ImPyPy-β-PyPyPy-Dp, and ImPyPy-
γ-PyPyPy-Dp, in which the terminal carboxyl group of ImPyPy
and the terminal amine of PyPyPy-Dp are connected with
glycine (G), β-alanine (β), and γ-aminobutyric acid (γ),
respectively, were recently reported.7,8 The γ-aminobutyric
acid-linked polyamide bound the designated target site 5'-
TGTTA-3' with high affinity and sequence specificity and
(6) (a) Mrksich, M.; Dervan, P. B. J. Am. Chem. Soc. 1993, 115, 9892-
9899. (b) Dwyer, T. J.; Geierstanger, B. H.; Mrksich, M.; Dervan, P. B.;
Wemmer, D. E. J. Am. Chem. Soc. 1993, 115, 9900. (c) Mrksich, M.;
Dervan, P. B. J. Am. Chem. Soc. 1994, 116, 3663. (d) Singh, M. P.; Plouvier,
B.; Hill, G. C.; Gueck, J.; Pon, R. T.; Lown, J. W. J. Am. Chem. Soc.
1994, 116, 7006.

(7) Mrksich, M.; Parks, M. E.; Dervan, P. B. J. Am. Chem. Soc. 1994,
116, 7983.

(8) Mrksich, M. Ph.D. Thesis, California Institute of Technology, 1994.
general sequence ImPyPy-X-PyPyPy-Dp, where X = Py, G, β,
or γ, to the 9 bp site 5'-TGTTAAACA-3' and to the 13 bp sites
5'-AAAAAGACAAAAA-3' and 5'-ATATAGACATATA-3'.
The polyamides having internal N-methylpyrrole, glycine, and
β-alanine residues were anticipated to bind the 9 and 13 bp
sites in an extended conformation. It was not clear at the outset
if the γ-aminobutyric acid-linked polyamide ImPyPy-γ-PyPyPy-
Dp would bind exclusively in an extended or "hairpin"
conformation to the targeted sites. For ImPyPy-X-PyPyPy-Dp
polyamides binding in an extended conformation, the
polyamide-DNA complexes expected to form at the 9 bp and 13 bp
target sites represent two distinct binding modes, which we refer
to as "overlapped" and "slipped", respectively. In the
"overlapped" (9 bp) binding mode, two ImPyPy-X-PyPyPy-Dp
polyamides bind directly opposite one another (Figures 1a and
2a). The "slipped" (13 bp) binding mode integrates the 2:1
and 1:1 polyamide-DNA binding motifs at a single site. In
this binding mode, the ImPyPy moieties of two ImPyPy-X-
PyPyPy-Dp polyamides bind the central 5'-AGACA-3' sequence
in a 2:1 manner as in the ImPyPy homodimer,1 and the PyPyPy
moieties of the polyamides bind the all-A,T flanking sequences
as in the 1:1 complexes of distamycin (Figures 1b and 2b). The
structure of the complex formed by the polyamide ImPyPy-G-
PyPyPy-Dp with a 13 bp target site has been characterized by
2D NMR.9

In the 9 bp "overlapped" and 13 bp "slipped" binding sites
described above, the G·C and C·G base pairs are separated by
one and five A,T base pairs, respectively. While we have
concentrated here on these sites, we note that "partially slipped"
sites of 10, 11, and 12 bp in which the G·C and C·G base pairs
are separated by two, three, and four A,T base pairs, respec-
tively, are also potential binding sites of the polyamides studied
here.

Studies of the energetics of distamycin binding have shown
that while the binding affinities are similar for complexation to
poly[d(A-T)]·poly[d(A-T)] and poly[d(A)]·poly[d(T)], the
origins of these binding affinities are different.10 Binding to the
alternating copolymer is enthalpy driven, while binding to the
homopolymer is entropy driven.10 However, not all 5 bp sites
(A,T)₅ are bound with equal affinity. By quantitative
footprinting experiments, the distamycin analog Ac-PyPyPy-Dp (Ac =
acetyl) was shown to bind the sites 5'-AATAA-3' and 5'-
TTAAT-3' with 2-fold and 14-fold lower affinity, respectively,
relative to the site 5'-AAAAA-3'.1c On the basis of this result,
we anticipated that the 13 bp site 5'-AAAAAGACAAAAA-3'
may be bound with higher affinity than the 13 bp site
5'-ATATAGACATATA-3'.

The binding affinities of polyamides 1-4 (Figure 3) for the
three targeted sites 5'-TGTTAAACA-3', 5'-AAAAAGACAAAAA-3',
and 5'-ATATAGACATATA-3' were determined
by quantitative DNase I footprint titration experiments.

Results 

Synthesis of Polyamides. The polyamides ImPyPy-Py-
PyPyPy-Dp (1),11 ImPyPy-G-PyPyPy-Dp (2),7 ImPyPy-β-
PyPyPy-Dp (3),7 and ImPyPy-γ-PyPyPy-Dp (4)7 were prepared
as described previously.



(9) Geierstanger, B. H.; Mrksich, M.; Dervan, P. B.; Wemmer, D. E.
Nature Struct. Biol. 1996, 3, 321.

(10) (a) Breslauer, K. J.; Remeta, D. P.; Chou, W.-Y.; Ferrante, R.; Curry,
J.; Zaunczkowski, D.; Snyder, J. G.; Marky, L. A. Proc. Natl. Acad. Sci.
U.S.A. 1987, 84, 8922. (b) Marky, L. A.; Breslauer, K. J. Proc. Natl. Acad.
Sci. U.S.A. 1987, 84, 4359. (c) Marky, L. A.; Kupke, K. J. Biochemistry
1989, 28, 9982.

(11) Kelly, J. J.; Baird, E. E.; Dervan, P. B. Proc. Natl. Acad. Sci. U.S.A.,
in press.




Figure 3. Structures of the four polyamides ImPyPy-X-PyPyPy-Dp,
where X = N-methylpyrrole (Py), glycine (G), β-alanine (β), and
γ-aminobutyric acid (γ).



Footprinting. Quantitative DNase I footprinting12,13 on the
3'-³²P-labeled 281 bp pJT3 AflII/FspI restriction fragment
(Figures 4 and 5) (10 mM Tris-HCl, 10 mM KCl, 10 mM
MgCl₂, 5 mM CaCl₂, pH 7.0, 22 °C) reveals that, of the four
polyamides ImPyPy-X-PyPyPy-Dp, three (X = Py, G, β) bind
to both the 9 bp "overlapped" site 5'-TGTTAAACA-3' and the
13 bp "slipped" site 5'-AAAAAGACAAAAA-3' with equilibrium
association constants (Ka) greater than 5 × 10⁷ M⁻¹ (Table
1) and display cooperative binding isotherms (eq 2, n = 2) at
these sites consistent with binding as intermolecular dimers
(Figure 6). The fact that the polyamides ImPyPy-G-PyPyPy-
Dp and ImPyPy-β-PyPyPy-Dp bind in the 9 bp "overlapped"
binding mode indicates that the internal glycine and β-alanine
amino acids are accommodated opposite a second ligand in a
2:1 polyamide-DNA complex.

The polyamide ImPyPy-γ-PyPyPy-Dp binds the site 5'-
TGTTAAACA-3' with an equilibrium association constant Ka ≈
1 × 10⁸ M⁻¹ and also binds the site 5'-AAAAAGACAAAAA-
3' with lower affinity (Ka = 6 × 10⁶ M⁻¹). This compound
displays binding isotherms (eq 2, n = 1) at these sites (Figure
6), consistent with binding as an intramolecular hairpin to the
5 bp "matched" sites 5'-TGTTA-3' and 5'-AAACA-3' (Figure
7) and to the 5 bp "single base pair mismatch" site 5'-AGACA-
3'.7 Significantly, it appears from these results that
γ-aminobutyric acid does not effectively link polyamide subunits in
an extended conformation.
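
The distinction drawn above between cooperative (n = 2) and noncooperative (n = 1) isotherms can be made concrete with a small calculation. The sketch below evaluates the saturation term of a Hill-type isotherm, written here as (Ka[L])ⁿ/(1 + (Ka[L])ⁿ), and reports how wide a concentration range is needed to go from 10% to 90% occupancy. The function names and the example Ka are illustrative assumptions, not values from the paper.

```python
def hill_theta(ka, l_tot, n):
    """Fractional saturation for a Hill-type isotherm with exponent n."""
    x = (ka * l_tot) ** n
    return x / (1.0 + x)

def conc_for_theta(ka, theta, n):
    """Ligand concentration that gives fractional saturation theta."""
    return (theta / (1.0 - theta)) ** (1.0 / n) / ka

KA = 1e8  # hypothetical association constant (M^-1)

spans = {}
for n in (1, 2):
    # Ratio of the concentrations giving 90% and 10% occupancy
    spans[n] = conc_for_theta(KA, 0.9, n) / conc_for_theta(KA, 0.1, n)
    print(f"n = {n}: 10%-90% transition spans a {spans[n]:.0f}-fold concentration range")
```

For n = 1 the transition spans an 81-fold range and for n = 2 a 9-fold range, independent of Ka. This difference in steepness is what lets a titration curve discriminate between the two binding models.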

Binding Affinities. Comparison of the binding affinities of
the four polyamides ImPyPy-X-PyPyPy-Dp, where X = Py, G,
β, and γ, reveals that the internal amino acid X has a dramatic
effect on complex stabilities (Table 1, Figure 6). The formally
N-methylpyrrole-linked polyamide ImPyPy-Py-PyPyPy-Dp binds

(12) (a) Galas, D.; Schmitz, A. Nucleic Acids Res. 1978, 5, 3157. (b)
Fox, K. R.; Waring, M. J. Nucleic Acids Res. 1984, 12, 9271.

(13) (a) Brenowitz, M.; Senear, D. F.; Shea, M. A.; Ackers, G. K.
Methods Enzymol. 1986, 130, 132. (b) Brenowitz, M.; Senear, D. F.; Shea,
M. A.; Ackers, G. K. Proc. Natl. Acad. Sci. U.S.A. 1986, 83, 8462. (c)
Senear, D. F.; Brenowitz, M.; Shea, M. A.; Ackers, G. K. Biochemistry
1986, 25, 7344.
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5 ' -GCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCT^ 
, 3 ' -CGTTGACAACCCTTCCCGCTAGCCAC^^ 

CCAGGGTTTTCCCACTtJAaSACG'TO^ 

GGTCCC AA ^GGGTC AGTGCTGC AAC A TTTTG CTG CCGGTC AC rTAAGCTCG AGCCA 'JTiGtiCl.: I "t "I 'ISCA'PCGC AUViGC C ACGTTTTTCTGTTTTT CCG 

TCGACGCCGCATATACACATATAGGGCCXTCGACGCTGTTAAACAGGCTCGACGCCAGCTCGTCCTAGCTAGCGTCGTAGCC - 3 ■ 

AGCTCCCCCC T ATATCTGTAT ATCC CG G C AGC TG CG ACAATYTCT CCGAGCTGCGGTCG AGCAGG ATCGATCGCAGC ATCGCAGAATT - 5 * 

Figure 4. Sequence of the 281 bp pJT3 AflII/FspI restriction fragment. The three binding sites that were analyzed in quantitative footprint titration
experiments are indicated.

Table 1. Association Constants (M⁻¹) for Polyamides ImPyPy-X-PyPyPy-Dp, Where X = N-Methylpyrrole (Py), Glycine (G), β-Alanine (β),
and γ-Aminobutyric Acid (γ)a,b

                                                    polyamide
binding site           ImPyPy-Py-PyPyPy-Dp   ImPyPy-G-PyPyPy-Dp   ImPyPy-β-PyPyPy-Dp   ImPyPy-γ-PyPyPy-Dp

5'-TGTTAAACA-3'        9.7 (±2.3) × 10⁷      1.4 (±0.1) × 10⁸     7.8 (±0.6) × 10⁸     1.4 (±0.3) × 10⁸

5'-AAAAAGACAAAAA-3'    5.4 (±1.5) × 10⁷      1.1 (±0.1) × 10⁸     >4.7 (±0.7) × 10⁹    6.4 (±0.6) × 10⁶

5'-ATATAGACATATA-3'    3.6 (±0.5) × 10⁷      M> (±0.4) × 10⁶      1.0 (±0.1) × 10⁹     4.6 (±0.5) × 10⁶

a The reported association constants are the mean values obtained from three DNase I footprint titration experiments. The standard deviation for
each value is indicated in parentheses. b The assays were carried out at 22 °C at pH 7.0 in the presence of 10 mM Tris-HCl, 10 mM KCl, 10 mM
MgCl₂, and 5 mM CaCl₂.



5'-TGTTAAACA-3' and 5'-AAAAAGACAAAAA-3' with
equilibrium association constants Ka ≈ 1 × 10⁸ M⁻¹ and 5 × 10⁷
M⁻¹, respectively. The glycine-linked polyamide ImPyPy-G-
PyPyPy-Dp binds both 5'-TGTTAAACA-3' and 5'-AAAAA-
GACAAAAA-3' with affinities similar (equal and 2-fold higher,
respectively) to those of ImPyPy-Py-PyPyPy-Dp. In contrast,
the β-alanine-linked polyamide ImPyPy-β-PyPyPy-Dp binds 5'-
TGTTAAACA-3' and 5'-AAAAAGACAAAAA-3' with
affinities higher than those of ImPyPy-Py-PyPyPy-Dp by factors of
approximately 8 (Ka = 8 × 10⁸ M⁻¹) and 85 (Ka = 5 × 10⁹
M⁻¹), respectively. Relative to ImPyPy-Py-PyPyPy-Dp, the
hairpin-forming polyamide ImPyPy-γ-PyPyPy-Dp binds 5'-
TGTTAAACA-3' (a matched hairpin binding site) and 5'-
AAAAAGACAAAAA-3' (a mismatched hairpin binding site)
with equal and 8-fold lower affinities, respectively.
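
The fold differences quoted above follow directly from ratios of the Table 1 association constants, and each ratio can also be expressed as a free energy difference through ΔG = −RT ln Ka. The sketch below is only an illustration: the input values are taken from Table 1 as transcribed in this copy (the scan of that table is imperfect, so treat them as approximate), and the conversion to free energy is a standard relation rather than something the authors report.

```python
import math

R_KCAL = 1.987e-3          # gas constant, kcal mol^-1 K^-1
T_KELVIN = 22.0 + 273.15   # assay temperature quoted in the text (22 degrees C)

def delta_g(ka):
    """Standard free energy of binding (kcal/mol) from an association constant."""
    return -R_KCAL * T_KELVIN * math.log(ka)

# Association constants (M^-1) as transcribed from Table 1
ka_py = {"9bp": 9.7e7, "13bp": 5.4e7}     # ImPyPy-Py-PyPyPy-Dp
ka_beta = {"9bp": 7.8e8, "13bp": 4.7e9}   # ImPyPy-beta-PyPyPy-Dp

for site in ("9bp", "13bp"):
    fold = ka_beta[site] / ka_py[site]
    ddg = delta_g(ka_beta[site]) - delta_g(ka_py[site])
    print(f"{site}: {fold:.0f}-fold tighter, ddG = {ddg:.2f} kcal/mol")
```

With these inputs the ratios come out close to the factors of approximately 8 and 85 stated in the text, which correspond to roughly 1.2 and 2.6 kcal/mol of additional binding free energy.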

Specificity for 5'-AAAAAGACAAAAA-3' versus 5'-
ATATAGACATATA-3'. Comparison of the binding affinities
of the four polyamides at the 13 bp sites 5'-AAAAAGACAAAAA-3'
and 5'-ATATAGACATATA-3' indicates that the
specificity between the sites depends on the internal amino acid
(Table 1). ImPyPy-G-PyPyPy-Dp and ImPyPy-β-PyPyPy-Dp
are approximately 20-fold and >5-fold specific, respectively, for
5'-AAAAAGACAAAAA-3' versus 5'-ATATAGACATATA-
3'. The polyamides ImPyPy-Py-PyPyPy-Dp and ImPyPy-γ-
PyPyPy-Dp bind 5'-AAAAAGACAAAAA-3' and 5'-ATATA-
GACATATA-3' with similar affinities.

Discussion 

Implications for the Design of Minor Groove Binding
Molecules. The results presented here reveal that β-alanine is
an optimal linker for joining two three-ring subunits in an
extended conformation, providing a useful structural motif for
the design of new polyamides targeted to sites longer than S
bp. Recently, it has been shown that as the length of a
polyamide having the general sequence Im(Py)x-Dp increases
beyond five rings (corresponding to a 7 bp binding site), the
binding affinity ceases to increase with increasing polyamide
length, indicating that the N-methylimidazole and
N-methylpyrrole residues fail to maintain an appropriate base-pair register
across the entire length of the polyamide-DNA complex.11 The
higher binding affinities observed here for ImPyPy-β-PyPyPy-
Dp relative to ImPyPy-Py-PyPyPy-Dp indicate that the flexible
β-alanine linker relieves the register mismatch, allowing both
three-ring subunits to bind optimally. Notably, higher binding
affinities are observed for ImPyPy-β-PyPyPy-Dp versus
ImPyPy-Py-PyPyPy-Dp despite the higher conformational entropy
and lower aromatic surface area of the β-alanine-linked
polyamide. The observation here that β-alanine effectively links
polyamide subunits within 2:1 polyamide-DNA complexes is
consistent with the previously reported finding that β-alanine
effectively links polyamide subunits within 1:1 polyamide-
DNA complexes.14

From the standpoint of binding specificity, the observation
here that a single compound can bind in multiple binding modes
is problematic. The next generation of pyrrole-imidazole
polyamides targeted to binding sites greater than S bp in length
should incorporate constraints that specify a single binding
mode.

The previously described γ-aminobutyric acid-based "hairpin"
motif7 complements the β-alanine-based "extended" motif
described here. For ImPyPy-γ-PyPyPy-Dp, the binding
isotherms and affinities observed here are consistent with the
previous report that γ-aminobutyric acid effectively links
polyamide subunits in a "hairpin" conformation7 and indicate
that γ-aminobutyric acid does not effectively link polyamide
subunits in an extended conformation. Significantly, these
observations indicate that (1) extended binding modes will not
compromise the sequence specificity of hairpin-forming,
γ-aminobutyric acid-linked polyamides, and (2) β-alanine and
γ-aminobutyric acid linkers could be used within a single polyamide
with predictable results.

The results reported here expand the binding site size
targetable with pyrrole-imidazole polyamides and provide
structural motifs that will facilitate the design of new pyrrole-
imidazole polyamides targeted to other sequences.

Experimental Section

Materials. Restriction endonucleases were purchased from either
New England Biolabs or Boehringer-Mannheim and used according to
the manufacturer's protocol. Escherichia coli XL-1 Blue competent
cells were obtained from Stratagene. Sequenase (version 2.0) was
obtained from United States Biochemical, and DNase I was obtained
from Pharmacia. [α-³²P]Thymidine 5'-triphosphate (>3000 Ci/mmol)
and [α-³²P]deoxyadenosine 5'-triphosphate (>6000 Ci/mmol) were
purchased from DuPont NEN.

Construction of Plasmid DNA. Plasmid pJT3 was prepared by
standard methods. Briefly, plasmid pJT1 was prepared by hybridization
of two complementary sets of synthetic oligonucleotides: 5'-CCCG-

(14) (a) Youngquist, R. S.; Dervan, P. B. J. Am. Chem. Soc. 1987, 109,
7564. (b) Griffin, J. H.; Dervan, P. B. Unpublished results.
Figure 5. Storage phosphor autoradiograms of 8% denaturing polyacrylamide gels used to separate the fragments generated by DNase I digestion
in quantitative footprint titration experiments. Lanes 1 and 2: A and G sequencing lanes. Lanes 3 and 21: DNase I digestion products obtained in
the absence of polyamide. Lanes 4-20: DNase I digestion products obtained in the presence of 0.1 nM (0.01 nM), 0.2 nM (0.02 nM), 0.5 nM (0.05
nM), 1 nM (0.1 nM), 1.5 nM (0.15 nM), 2.5 nM (0.25 nM), 4 nM (0.4 nM), 6.5 nM (0.65 nM), 10 nM (1 nM), 15 nM (1.5 nM), 25 nM (2.5 nM),
40 nM (4 nM), 65 nM (6.5 nM), 100 nM (10 nM), 200 nM (20 nM), 500 nM (50 nM), 1 µM (0.1 µM) (concentrations used for polyamide
ImPyPy-β-PyPyPy-Dp only are in parentheses). Lane 22: intact DNA. The targeted binding sites are indicated on the right side of the autoradiograms.
All reactions contain 15 kcpm restriction fragment, 10 mM Tris-HCl, 10 mM KCl, 10 mM MgCl₂, and 5 mM CaCl₂.
Figure 6. Data obtained from quantitative DNase I footprint titration
experiments showing the effect of the internal amino acid X on binding
of the polyamides ImPyPy-X-PyPyPy-Dp to the sites (a) 5'-TGT-
TAAACA-3' and (b) 5'-AAAAAGACAAAAA-3', where X = G (□),
β (•), γ (O), and Py (a). The ($\theta_{norm}$, $[L]_{tot}$) data points were obtained
as described in the Experimental Section. Each data point is the average
value obtained from three quantitative footprint titration experiments.

Figure 7. Model for the complex formed by ImPyPy-γ-PyPyPy-Dp
with 5'-TGTTAAACA-3' (5 bp, "hairpin").

GAACGTAGCGTACCGGTGCAAAAAGACAAAAAGG-
CTCGA-3' and 5'-GGCGTCGAGCCTTTTTGTCTTTTTGCACC-
GGTACGCTACGTTC-3', and 5'-CGCCGCATATAGACATATAG-
GGCCCAGCTCGTCCTAGCTAGCGTCGTAGCGTCTTAAG-3' and
5'-TCGACTTAAGACGCTACGACGCTAGCTAGGACGAGCT.
duplexes were phosphorylated with ATP and T4 polynucleotide kinase
and ligated to the large pUC19 AvaI/SalI restriction fragment using
T4 DNA ligase. E. coli XL-1 Blue competent cells were then
transformed with the ligated plasmid, and plasmid DNA from
ampicillin-resistant white colonies isolated using a Promega Maxi-Prep kit.
Plasmid pJT3 was prepared in a similar manner, except that the
following synthetic oligonucleotides were hybridized and cloned into
the large ApaI/SalI fragment of pJT1:
5'-GTCGACGCTGTTAAACAGGCTCGACGCCAGCTCGTCCTAGCTAGCGTCGTAGCGTCTTAAGAG-3'
and
5'-TCGACTCTTAAGACGCTACGACGCTAGCTAGGACGAGCTGGCGTCGAGCCTGTTTAACAGCGTCGACGGCC-3'.
The presence of the desired insert
was determined by restriction analysis and dideoxy sequencing.
Plasmid DNA concentration was determined at 260 nm using the
relation 1 OD unit = 50 µg/mL duplex DNA.
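
Complementary oligonucleotide pairs such as those used for the plasmid inserts can be sanity-checked by computing reverse complements, and the absorbance relation quoted above is a one-line conversion. The sketch below is illustrative; the helper names are hypothetical, and the three sequences are the binding sites named in the text rather than the full oligonucleotides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def duplex_ug_per_ml(a260, factor=50.0):
    """Duplex DNA concentration from A260, using 1 OD unit = 50 ug/mL."""
    return a260 * factor

# The three target sites studied in the footprinting experiments
for site in ("TGTTAAACA", "AAAAAGACAAAAA", "ATATAGACATATA"):
    print(f"5'-{site}-3'  pairs with  5'-{reverse_complement(site)}-3'")
```

The reverse complement of the 13 bp site, 5'-TTTTTGTCTTTTT-3', is the stretch that appears in the second oligonucleotide of the first pair listed for pJT1, which is a quick consistency check on the transcribed sequences.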

Preparation of ³²P-End-Labeled Restriction Fragments. Plasmid
pJT3 was digested with AflII, labeled at the 3'-end using Sequenase
(version 2.0), and digested with FspI. The 281 bp restriction fragment
was isolated by nondenaturing gel electrophoresis and used in all
quantitative footprinting experiments described here. Chemical
sequencing reactions were performed as described.15,16 Standard
techniques were employed for DNA manipulations.17

Quantitative DNase I Footprint Titration.12,13 All reactions were
executed in a total volume of 40 µL. We note explicitly that no carrier
DNA was used in these reactions. A polyamide stock solution or H₂O
(for reference lanes) was added to an assay buffer containing radio-
labeled restriction fragment (15 000 cpm), affording final solution
conditions of 10 mM Tris-HCl, 10 mM KCl, 10 mM MgCl₂, 5 mM
CaCl₂, pH 7.0, and either (i) 0.1 nM to 1 µM polyamide, for all
polyamides except ImPyPy-β-PyPyPy-Dp, (ii) 0.01 nM to 0.1 µM
polyamide for ImPyPy-β-PyPyPy-Dp, or (iii) no polyamide (for
reference lanes). The solutions were allowed to equilibrate for 5 h at
22 °C. Footprinting reactions were initiated by the addition of 4 µL
of a DNase I stock solution (at the appropriate concentration to give
~55% intact DNA) containing 1 mM dithiothreitol and allowed to
proceed for 7 min at 22 °C. The reactions were stopped by the addition
of 10 µL of a solution containing 1.25 M NaCl, 100 mM EDTA, and
0.2 mg/mL glycogen, and ethanol precipitated. The reaction mixtures
were resuspended in 1X TBE/80% formamide loading buffer, denatured
by heating at 85 °C for 10 min, and placed on ice. The reaction
products were separated by electrophoresis on an 8% polyacrylamide
gel (5% cross-link, 7 M urea) in 1X TBE at 2000 V. Gels were dried
and exposed to a storage phosphor screen (Molecular Dynamics).

Quantitation and Data Analysis. Data from the footprint titration
gels were obtained using a Molecular Dynamics 400S PhosphorImager
followed by quantitation using ImageQuant software (Molecular
Dynamics). Background-corrected volume integration of rectangles
encompassing the footprint sites and a reference site at which DNase
I reactivity was invariant across the titration generated values for the
site intensities ($I_{site}$) and the reference intensity ($I_{ref}$). The apparent
fractional occupancies ($\theta_{app}$) of the sites were calculated using the
equation

$$\theta_{app} = 1 - \frac{I_{site}/I_{ref}}{I_{site}^{\circ}/I_{ref}^{\circ}} \qquad (1)$$

where $I_{site}^{\circ}$ and $I_{ref}^{\circ}$ are the site and reference intensities, respectively,
from a control lane to which no polyamide was added.
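
As a worked illustration of this occupancy calculation, the sketch below assumes eq 1 has the usual footprint-titration form, θapp = 1 − (Isite/Iref)/(I°site/I°ref), which is what the definitions of the four intensities imply. All band volumes are hypothetical numbers in arbitrary units, not data from the paper, and the function name is illustrative.

```python
def theta_app(i_site, i_ref, i_site_ctrl, i_ref_ctrl):
    """Apparent fractional occupancy from background-corrected band volumes.

    Dividing by the reference site corrects for lane-to-lane loading
    differences; dividing by the control lane sets zero occupancy.
    """
    return 1.0 - (i_site / i_ref) / (i_site_ctrl / i_ref_ctrl)

# Hypothetical integrated volumes (arbitrary units)
control = {"site": 1200.0, "ref": 1000.0}      # lane with no polyamide
lanes = [
    {"site": 1150.0, "ref": 980.0},            # low polyamide concentration
    {"site": 640.0, "ref": 1020.0},            # intermediate concentration
    {"site": 130.0, "ref": 1010.0},            # high concentration
]

occupancies = [
    theta_app(lane["site"], lane["ref"], control["site"], control["ref"])
    for lane in lanes
]
print([round(t, 3) for t in occupancies])
```

A site band that loses half of its reference-normalized intensity relative to the control lane gives an occupancy of 0.5, and a fully protected site approaches 1.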

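The ($[L]_{tot}$, $\theta_{app}$) data points were fitted to a general Hill equation
(eq 2) by minimizing the difference between $\theta_{app}$ and $\theta_{fit}$:

$$\theta_{fit} = \theta_{min} + (\theta_{max} - \theta_{min}) \frac{K_a^{n}[L]_{tot}^{n}}{1 + K_a^{n}[L]_{tot}^{n}} \qquad (2)$$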

where $[L]_{tot}$ is the total polyamide concentration, $K_a$ is the apparent
association constant, and $\theta_{min}$ and $\theta_{max}$ are the experimentally
determined site saturation values when the site is unoccupied or saturated,
respectively. The data were fitted using a nonlinear least squares fitting
procedure with $K_a$, $\theta_{max}$, and $\theta_{min}$ as the adjustable parameters, and with
either n = 2 or n = 1 depending on which value of n gave the better
fit. We note explicitly that treatment of the data in this manner does
not represent an attempt to model a binding mechanism. Rather, we
have chosen to compare values of the association constant, a parameter
that represents the concentration of polyamide at which the binding
site is half-saturated. The binding isotherms were normalized using
the following equation:

$$\theta_{norm} = \frac{\theta_{app} - \theta_{min}}{\theta_{max} - \theta_{min}} \qquad (3)$$

Three sets of data were used in determining each association constant.
The method for determining association constants used here involves
the assumption that $[L]_{tot} \approx [L]_{free}$, where $[L]_{free}$ is the concentration of
polyamide free in solution (unbound). For very high association
constants this assumption becomes invalid, resulting in underestimated
association constants. In the experiments described here, the DNA
concentration is estimated to be ~50 pM. As a consequence, measured
association constants of 1 × 10⁹ M⁻¹ and 5 × 10⁹ M⁻¹ underestimate
the true association constants by factors of approximately <1.5 and
1.5-2, respectively.

(15) Iverson, B. L.; Dervan, P. B. Nucleic Acids Res. 1987, 15, 7823-
7830.

(16) Maxam, A. M.; Gilbert, W. S. Methods Enzymol. 1980, 65, 499-
560.

(17) Sambrook, J.; Fritsch, E. F.; Maniatis, T. Molecular Cloning; Cold
Spring Harbor Laboratory: Cold Spring Harbor, NY, 1989.
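
The fitting step described in this section can be sketched in a few lines. The text does not say which algorithm or software performed the nonlinear least squares fit here, so the example below swaps in a simple substitute: a grid search over log Ka, with the two saturation limits solved by ordinary linear least squares at each trial Ka, and the exponent n chosen by the lower residual. The data are synthetic and noise-free, and all names and numbers are hypothetical.

```python
import math

def hill(l_tot, ka, n):
    """Saturation term of the Hill equation: (Ka*L)^n / (1 + (Ka*L)^n)."""
    x = (ka * l_tot) ** n
    return x / (1.0 + x)

def fit_hill(conc, theta, n, log_ka_lo=5.0, log_ka_hi=11.0, step=0.01):
    """Grid-search Ka; at each Ka, theta = a + b*hill is linear in a and b."""
    best = None
    m = len(conc)
    t_mean = sum(theta) / m
    for k in range(int(round((log_ka_hi - log_ka_lo) / step)) + 1):
        ka = 10.0 ** (log_ka_lo + k * step)
        f = [hill(c, ka, n) for c in conc]
        f_mean = sum(f) / m
        sff = sum((fi - f_mean) ** 2 for fi in f)
        if sff == 0.0:
            continue  # degenerate: curve is flat over the sampled range
        slope = sum((fi - f_mean) * (ti - t_mean) for fi, ti in zip(f, theta)) / sff
        intercept = t_mean - slope * f_mean
        sse = sum((intercept + slope * fi - ti) ** 2 for fi, ti in zip(f, theta))
        if best is None or sse < best["sse"]:
            best = {"ka": ka, "n": n, "sse": sse,
                    "theta_min": intercept, "theta_max": intercept + slope}
    return best

def theta_norm(theta, theta_min, theta_max):
    """Normalization of eq 3: rescale occupancies to run from 0 to 1."""
    return (theta - theta_min) / (theta_max - theta_min)

# Synthetic titration: 17 concentrations from 0.1 nM to 1 uM, n = 2, Ka = 1e8
conc = [1e-10 * 10 ** (i / 4) for i in range(17)]
theta = [0.02 + 0.90 * hill(c, 1e8, 2) for c in conc]

fits = [fit_hill(conc, theta, n) for n in (1, 2)]
best = min(fits, key=lambda r: r["sse"])
print(f"n = {best['n']}, Ka = {best['ka']:.2e} M^-1")
```

Because the model is linear in the two saturation limits once Ka and n are fixed, only Ka needs a search, which keeps the sketch dependency-free. A production analysis would more likely use a dedicated optimizer and report parameter uncertainties.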

Acknowledgment. We are grateful to the National Institutes
of Health (Grant GM-27681) for research support, to the
National Science Foundation for a predoctoral fellowship to
J.W.T., and to the Howard Hughes Medical Institute for a
predoctoral fellowship to E.E.B.

JA960726O



RESEARCH ARTICLE



Hin Recombinase Bound to DNA: 
The Origin of Specificity in Major 
and Minor Groove Interactions 

Jin-An Feng, Reid C. Johnson, Richard E. Dickerson* 

The structure of the 52-amino acid DNA-binding domain of the prokaryotic Hin recombinase, complexed with a DNA recombination half-site, has been solved by x-ray crystallography at 2.3 angstrom resolution. The Hin domain consists of a three-α-helix bundle, with the carboxyl-terminal helix inserted into the major groove of DNA, and two flanking extended polypeptide chains that contact bases in the minor groove. The overall structure displays features resembling both a prototypical bacterial helix-turn-helix and the eukaryotic homeodomain, and in many respects is an intermediate between these two DNA-binding motifs. In addition, a new structural motif is seen: the six-amino acid carboxyl-terminal peptide of the Hin domain runs along the minor groove at the edge of the recombination site, with the peptide backbone facing the floor of the groove and side chains extending away toward the exterior. The x-ray structure provides an almost complete explanation for DNA mutant binding studies in the Hin system and for DNA specificity observed in the Hin-related family of DNA invertases.



The Hin recombinase catalyzes a DNA inversion reaction in the Salmonella chromosome (1, 2). This site-specific recombination reaction controls the alternate expression of two flagellin genes by reversibly switching the orientation of a promoter. During the process of inverting the 1-kb segment of DNA, Hin proteins bind to left and right recombination sites (hixL and hixR, respectively) located at the boundaries of the invertible DNA segment. The hixL and hixR sites with their bound Hin protein then form a synaptic complex with a third cis-acting site, a recombinational enhancer, which itself is bound by two dimers of the 98-amino acid Fis protein. Formation of this invertasome complex (3-5) aligns the two recombination sites correctly and activates the Hin protein to initiate the exchange of DNA strands, leading to inversion of the intervening DNA.

Hin belongs to a family of bacterial DNA invertases or recombinases that includes Gin from phage Mu, Cin from phage P1, and Pin from the e14 prophage of Escherichia coli. In addition to sharing 66 to 80 percent sequence identity between pairs of sequences, this family of proteins can substitute functionally for one another in each biological system (1). These DNA invertases most likely constitute an evolutionary family not unlike the c-type cytochromes. The availability of DNA sequence information from the binding sites of all four systems makes the present study of Hin-DNA binding especially informative in elucidating principles of protein-DNA recognition.

The authors are with the Molecular Biology Institute and Department of Biological Chemistry, University of California, Los Angeles, CA 90024.

*To whom correspondence should be addressed at the Molecular Biology Institute.

Hin binds to each hixL and hixR recombination site as a dimer. The final 52 amino acids of the two chains (Fig. 1A) are involved in binding to a 26-bp recombination site (Fig. 1B) built from two 12-bp imperfect inverted repeats separated by a 2-bp core region where DNA strand exchange occurs (6-8). The amino-terminal 138-amino acid "catalytic" domain is positioned in part over the core nucleotides. Although the monomeric 52-amino acid peptide by itself can bind to a recombination half-site with low-to-moderate affinity (dissociation constant ≈ 10⁻⁷), cooperative interactions between the amino-terminal domains of two intact Hin molecules are required for high-affinity binding (K_d ≈ 10⁻⁹) (8-10).

A large body of footprinting, mutation, and chemical derivatization data has indicated features of Hin-DNA interaction which are distinctive to prokaryotic DNA-binding proteins (8-13). Specific binding requires both major groove interactions involving a helix-turn-helix (HTH) α-helix motif and minor groove interactions involving the sequence Gly139-Arg140-Pro141-Arg142. If residues 139 and 140 are deleted from the carboxyl-terminal domain, for example, then sequence-specific binding to DNA is abolished. As Fig. 1A shows, these



A  Amino acid sequences of DNA-binding domains of enteric invertases



Hh ORPRAITKH . EQEQISW-LEX . OHP . RQOLAIIF .GIG . VBTLTRY7 . PASS I KXHMN 
Gh ORPFXLTKA . EQEQAGRLLAQ . OI P . JUCQVALXT . DVA . LflTLTKKH . PAXRAH I ENDDRH* 
CH 0RRP1CYQE£ . TQQQKKRLLZK .017. XKOVAIXT . DVA . VSTLTXXT . PASSFQS 
PVi ORRPKLTPE . QQAQAGHI*! AA . OTP . RQKVAIXT . DVG . VSTLTKRJT . FAGDK 

I I I.I 1 

139 149 162 173 181 

B  Left inversion site, hixL

  -13       -8            -1 +1            +8       +13
5'-T-T-C-T-T-G-A-A-A-A-C-C-A-A-G-G-T-T-T-T-T-G-A-T-A-A-3'
3'-A-A-G-A-A-C-T-T-T-T-G-G-T-T-C-C-A-A-A-A-A-C-T-A-T-T-5'

C  Synthetic hix half-site

              2  3  4  5  6  7  8  9 10 11 12 13 14 15
Strand 1:  5'-T--G--T--T--T--T--T*-G--A--T--A--A--G--A-3'
Strand 2:     3'-C--A--A--A--A--A--C--T--A--T--T--C*-T--A-5'
                29 28 27 26 25 24 23 22 21 20 19 18 17 16
Fig. 1. Amino acid and DNA base sequences in the Hin recombinase family (1, 34). (A) Amino acid sequence of the DNA-binding domain (the 52 carboxyl-terminal residues) of the Hin protein and of corresponding regions in Gin, Cin, and Pin. Residues in boldface are identical in all four sequences or at least in three of the four. α Helices 1 to 3 as located in our Hin structure analysis are marked above the Hin sequence. For crystallography, this Hin fragment was synthesized manually or on an ABI synthesizer by the solid-phase method as described previously (11). (B) Base sequence of the left DNA inversion site, hixL. Numbering is in either direction from the center of the inverted repeat. Asterisks mark purine bases that are protected from methylation by the binding of Hin. (C) The 14-bp synthetic hix half-site as cocrystallized with the Hin 52-mer. Strand 1 of the duplex is numbered 2 through 15 to match the right half of the entire hixL site in (B). Bases in strand 2 are numbered separately. For ease of reference, note that base n in strand 1 is paired with base (32 - n) in strand 2. Base pairs will be referenced by strand 1 numbers alone; that is, "base pair 9" is understood to signify base pair G9-C23. Phosphates always are numbered according to the base that follows: phosphate P9 occurs between T8 and G9, whereas across the helix on the other strand, phosphate P24 lies between C23 and A24. Hence phosphate n on strand 1 lies opposite phosphate (33 - n) on strand 2. Asterisks mark bases that were iodinated for purposes of multiple isomorphous replacement phase analysis.
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two residues also are invariant among all 
four of the DNA invertases.
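
The numbering convention stated in the Fig. 1 caption (base n on strand 1 pairs with base 32 - n on strand 2, and phosphate n lies opposite phosphate 33 - n) can be written as a small lookup helper. This is only an illustrative sketch; the function names are hypothetical, and the valid range of 3 to 15 reflects the 13 complete base pairs of the half-site, with T2 and A16 unpaired.

```python
def partner_base(n):
    """Strand 2 base number paired with strand 1 base n.

    Only strand 1 bases 3 through 15 are paired; T2 has no partner.
    """
    if not 3 <= n <= 15:
        raise ValueError("only strand 1 bases 3 through 15 are paired")
    return 32 - n


def opposite_phosphate(n):
    """Strand 2 phosphate lying across the helix from strand 1 phosphate n."""
    return 33 - n


# "Base pair 9" is G9-C23, and phosphate P9 lies opposite P24.
print(partner_base(9), opposite_phosphate(9))
```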

By comparison with pairwise amino acid sequence identities of 66 to 80 percent in the entire protein sequences, the carboxyl-terminal peptides shown in Fig. 1A have somewhat fewer identities, 49 to 62 percent, but still are obviously homologous proteins. The Hin interactions resemble those in the binding of DNA by the homeodomain in eukaryotes. Indeed, as noted by Affolter et al. (14), the amino acid sequence of the Hin binding domain can be aligned with that of the homeodomain of the Drosophila engrailed protein with a 27 percent sequence identity, sufficient to suggest a class resemblance.

We report here the 2.3 Å resolution crystal structure analysis of the 52-amino acid carboxyl-terminal DNA-binding domain of Hin complexed with a 14-bp DNA oligomer containing a half hixL binding site (Fig. 1C). The Hin peptide forms a three-α-helix core with an extended chain at each end. The core interacts with the major groove of DNA, whereas the flanking amino- and carboxyl-terminal chains extend along two regions of the minor groove. The carboxyl-terminal eight-residue tail of the Hin peptide crosses the phosphodiester backbone and is inserted in the minor groove in a manner that has not heretofore been encountered in DNA-protein complexes. The crystal structure provides a virtually complete explanation of base specificity experiments in solution, including mutant studies.

Table 1. Summary of crystallographic analysis. Crystals of Hin domain/DNA complex were grown by vapor diffusion against 15 to 18 percent PEG 1500 as described elsewhere (15). The structure was determined initially to 3.2 Å resolution by multiple isomorphous replacement (MIR). Heavy atom parameters were refined and MIR phases calculated with the program HEAVY (44). The initial MIR map generated after solvent flattening (45) revealed clear density for B-form DNA and for most of the protein backbone density. This map was improved further by refining heavy atom parameters against solvent-flattened phases (46). After two additional cycles of phasing, solvent flattening, and heavy atom parameter refinement, the final MIR map, with mean phasing figure of merit of 0.55 for data between 20.0 and 3.2 Å, was used to build a model of the complex. However, it still was difficult to fit the amino acid sequence into many regions of the map. Only after phases were extended and modified to 2.8 Å by the method of Zhang and Main (47) did the map show clear density for side chains of some "marker" residues. At that point, all residues could be fitted unambiguously with the exception of the final eight carboxyl-terminal amino acids. Conventional positional refinement then was carried out to 2.8 Å with X-PLOR (48, 49). To refine the model further against a new 2.3 Å data set collected at -150°C, rigid-body refinements were carried out in successive steps to 2.5 Å. After positional refinement and simulated annealing, the (2Fo - Fc) map was of sufficient quality to allow the last eight residues to be built into the minor groove of the DNA, following a clear and continuous density. Electron density for residues Ser183 and Ser184, however, remained poorly defined. Refinement was extended to 2.3 Å in four cycles of simulated annealing with X-PLOR prior to tightly restrained B-factor refinement. At the present stage of refinement, the agreement of the atomic model to crystallographic data is R = 0.228 for 8.0 to 2.3 Å resolution data. Coordinates have been deposited with the Brookhaven Protein Data Bank and are available for immediate distribution.

Structure of the complex. The structure of the Hin-DNA complex was solved by multiple isomorphous replacement with the use of three iodine derivatives (Table 1). A typical section of the final (2Fo - Fc) map is shown in Fig. 2. One 52-amino acid Hin domain binds to each 14-bp DNA half-site (Fig. 3). DNA helices consisting of the 13 complete base pairs are stacked end-to-end in the crystal, as in figure 2 of (15). The unpaired base T2 on strand 1 (Fig. 1C) swings up to make a Hoogsteen-like interaction with base pair 3, G3-C29. At the other end, unpaired base A16 on strand 2 is not defined in the electron density map and presumably is disordered. The a axis of the crystal, the direction of stacking of two DNA helices, has a length at room temperature of 86.4 Å = 26 × 3.32 Å. The 12 base pair steps along the helix produce a total rotation of 407° (average 33.9° per step), so that the nonbonded interhelix junction between base pair 15 (A15-T17) of one helix and base pair 3 (G3-C29) of the next requires a reverse twist of 360° - 407° = -47°.
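
The helix arithmetic quoted above can be checked with a few lines. This is a minimal sketch that only re-derives the figures given in the text; the variable names are illustrative, and reading the factor of 26 as two stacked helices of 13 base pairs each is an interpretation rather than something stated explicitly.

```python
A_AXIS = 86.4          # length of the a axis in angstroms
STEPS_PER_AXIS = 26    # assumed: 2 stacked helices x 13 base pairs each
TOTAL_TWIST = 407.0    # degrees, summed over the steps within one helix
N_STEPS = 12           # base pair steps within one 13-bp helix

rise_per_step = A_AXIS / STEPS_PER_AXIS   # angstroms per base pair step
mean_twist = TOTAL_TWIST / N_STEPS        # degrees per step
junction_twist = 360.0 - TOTAL_TWIST      # twist left over for the junction

print(round(rise_per_step, 2), round(mean_twist, 1), junction_twist)
```

The negative junction value corresponds to the reverse twist the text describes at the nonbonded interhelix junction.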

The DNA half-site is a standard B-DNA helix, with the usual local variation in helix parameters (16, 17). The helix is relatively straight and not curved around the protein domain as in CAP, trp repressor, 434 repressor, and Met repressor (18-22), and as has been proposed for the Fis-DNA complex (23, 24). However, an unusually large amount of DNA surface area is contacted by Hin. Upon binding Hin peptide, the DNA half-site monomer loses 1816 Å² of static solvent-accessible surface area.

The DNA half-site contains a short run of five A-T base pairs (numbers 4 through 8) that could be regarded as a segment of A-tract DNA. Three frequent characteristics of A-tract DNA are a straight and unbent helix axis, narrow minor groove, and large propeller twist, large enough for the formation of bifurcated hydrogen bonds within the major groove between adjacent base pairs (25-27). However, in the Hin-DNA complex the minor groove maintains a uniform width of approximately 6.5 to 8.5 Å (minimal P-P atom separation across the groove, less 5.8 Å for two phosphate group radii), rather than the 3.5 to 4.5 Å typical of most A-tracts. Propeller twist is large all along the Hin-DNA complex, averaging -16°, but is not systematically larger in the A-A-A-A-A region than elsewhere. Hence the five A-T base pairs do not constitute a classical A-tract structure, perhaps because of binding to Hin.

Overall protein folding. The 52-amino acid DNA-binding domain of Hin consists of a compact bundle of three α-helices, with extended amino-terminal arm and carboxyl-terminal tail (Fig. 3). α-Helix 1 (Glu148 to Lys158) lies parallel to the axis of the DNA. α-Helix 2 (Arg162 to Phe169) is nearly antiparallel to helix 1 with an angle of -25° between helix axes, and α-helix 3 (Val173 to Phe180) is inserted in the major DNA groove parallel to the base pairs (not to the floor of the groove itself). The HTH motif formed by helices 2 and 3 is similar to those found in other prokaryotic regulatory DNA-binding proteins. The Hin HTH region can be superimposed on equivalent regions of Fis (23, 24) and λ repressor (28) with root-mean-square deviations in Cα atomic positions of only 0.61 and 0.76 Å, respectively.

All three α helices are amphipathic, with hydrophobic residues packed tightly against one another in a hydrophobic core (Fig. 3A). Ile152 and Leu156 of helix 1 interdigitate with Leu165 and Phe169 of helix 2. Val173, Leu176, and Phe180 of helix 3 also point into the hydrophobic core, which is delineated by the blue polypeptide backbone regions in Fig. 3A. At the bottom in that view, Ile144 on the amino-terminal arm closes the hydrophobic pocket. These hydrophobic interactions appear to be the main forces stabilizing the folding of the Hin protein. They also are strongly conserved among the other DNA invertases, Gin, Cin, and Pin (Fig. 1A), supporting the inference that all four of these proteins are folded in the same way. Hydrophobic interactions are supplemented by hydrogen bonds between side chains: Arg162 (invariant among the four invertases), at the beginning of helix 2, is hydrogen-bonded to main chain carbonyl oxygens of Phe180 (the final residue of helix 3) and Pro181. Most charged side chains of the protein are either in contact with DNA or exposed to the solvent.

Major groove protein-DNA interactions. α-Helix 3 is the DNA recognition helix for the Hin protein; helices 1 and 2 are too far from the DNA to permit direct interactions. Only Gln163 at the amino terminus of helix 2 makes an indirect DNA contact through a hydrogen bond to residue Tyr177 (invertase invariant) in helix 3, which in turn contacts phosphate P19 (Fig. 3B). Five interactions between helix 3 and DNA backbone phosphates position the recognition helix properly, and two amino acid side chains, Ser174 and Arg178, make specific bonds to the edges of base pairs G9-C23 and A10-T22, as detailed below. It is significant that four of the eight amino acids in helix 3 are completely invariant among the four DNA invertases, and another three are semi-invariant, with the same residue in three sequences and a closely related one in the fourth.

The five nonspecific interactions with DNA backbone phosphates are depicted in Fig. 3B. The side chain of Tyr177 reaches up to phosphate P19 on one edge of the major groove, whereas Tyr179 on the other side of the α helix reaches down to phosphate P8, directly across the groove on the other wall. One of the terminal -NH2 groups of the Arg178 side chain donates a hydrogen bond to the remaining oxygen of phosphate P8. The side chain of Thr175 and the main chain amide of Gly172 anchor helix 3 even further by donating hydrogen bonds to phosphate P9. In contrast to other HTH DNA-binding proteins, all of these nonspecific anchoring contacts to DNA phosphates are made by residues of helix 3; the "three-point contact" made by residues Gly172, Thr175, Tyr177, and Tyr179 efficiently braces helix 3 against the opening of the major groove of DNA in a position to make specific recognition interactions.

Specific base sequence recognition uses only two Hin side chains, Ser174 and Arg178, and in part involves the mediation of water molecules (Fig. 4) in a manner that has been proposed for the trp repressor (29). The side chain of Ser174 donates a hydrogen bond to the N-7 atom of A10. One turn of α helix away from this position, the terminal -NH2 of Arg178 that is not involved with P8 donates a similar hydrogen bond to the N-7 of G9. The Arg178 ε-imino nitrogen donates another hydrogen bond to bound water molecule 1, which in turn donates a bond to the O-4 of T22, essentially "reading" the fact that this base pair is indeed A10-T22 and not a G-C pair. The other proton of this water molecule bonds to water molecule 2, which receives another hydrogen bond from the N-6 amine of A21 (recognizing this as an A-T pair and not G-C), and donates hydrogen bonds to N-7 of the same base and to the main chain carbonyl oxygen of Ser174. Ser174 is invariant among all four DNA invertases. Arg178 is substituted only by Lys, and when this happens, two basic side chains always appear adjacent at positions 178 and 179 (Fig. 1A), meaning that some of the hydrogen bonds of the Hin structure could well be preserved. Both of the bound waters have the tetrahedral geometry expected for water molecules, donating two hydrogen bonds and accepting two others. The fourth bond to water 1, not shown explicitly in Fig. 4B, must be to another water molecule not well localized in the electron-density map.

All of these specific interactions are drawn in Fig. 4, A and B. Together they recognize base pairs 9, 10, and 11 of the half recombination site. Indeed, the binding of Hin is particularly sensitive to alterations of base pairs 9 and 10 (13). Dimethyl sulfate modification of the N-7 position of G9 inhibits Hin binding (8, 10). Methylation at the N-6 position of A10 by the deoxyadenosine methylase of Salmonella decreases binding affinity (13). Hin binding




Fig. 2. Stereo view of the (2Fo - Fc) electron density map at 2.3 Å resolution (blue contours) showing portions of DNA base pairs 8 to 11 (top) and the region of helix 3 around Arg178 and Tyr179 (marked). Contour level is 1σ. The protein framework is in red, and DNA is in green.
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also is strongly and adversely affected by 
mutations at these sites, being inhibited by
substitution of C at positions 9 and 11 or
either C or T at position 10.



Minor groove protein-DNA interactions: the amino-terminal arm. Genetic and biochemical studies have demonstrated that contacts made by the amino-terminal



Fig. 3. (A) Stereo diagram of the Hin-DNA complex. DNA is in blue stick bonds. The path of the Hin polypeptide chain is shown as a flexible tubing; orange in general, but blue for hydrophobic residues. Side chains that contact the DNA are drawn in orange and labeled in yellow. Note the amino-terminal arm within the minor groove at lower right (Gly139-Arg142), the carboxyl-terminal tail in the minor groove at upper left (Ile185-Asn190), and helix 3 nested in the major groove. Pink spheres are two bound water molecules involved in protein-DNA contacts within the recognition site. Hydrogen bonds are in green. (B) Schematic drawing of the complex, in the same view as the stereo. Helices are numbered 1 to 3, and key side chains that interact with the DNA are identified. Open dots along backbone ribbons locate Cα atoms. Phosphates 8, 9, and 19 are indicated specifically on the ribbons. Base pairs 9 to 11 are shown in stereo close-up in Fig. 4A, base pairs 4 to 8 in Fig. 5B, and pairs 9 to 15 in Fig. 7.
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arm of the Hin DNA-binding domain are at 
least as critical to DNA recognition as are
those of helix 3 (10, 12, 13). Indeed,
merely deleting Gly139 and Arg140 from the
Hin DNA-binding peptide is sufficient to
abolish specificity of binding to hixL (12).
These residues are invariant in all of the
DNA invertases.

The amino-terminal arm, Gly139-His147, adopts an extended conformation (Fig. 5). Clear electron density (Fig. 5A) allows Gly139 and Arg140 to be located unambiguously within the minor groove. The ε-imino of the Arg140 side chain donates a hydrogen bond to N-3 of A26. The unusually high 26° propeller twist of base pair 6 (T6-A26) permits a second hydrogen bond from the main chain amide of Arg140 to the O-2 of T6. If a G-C pair were to be substituted, this latter bond would become impossible, and the N-2 amine of guanine would push the Arg140 side chain away. Although the neighboring A-T base pairs are less propeller-twisted, the ability of an A-T pair to adopt such a large propeller twist contributes to the recognition process (25-27). Gly139 rests in close van der Waals contact with base pair 5; the main chain Cα atom of that residue is only 3.4 Å from the O-2 of T5 and 4.1 Å from the C-2 of A27. Introduction of an amine group at that locus, as in guanine, would push the Hin polypeptide chain up and away from the floor of the minor groove at that point. Each base pair substitution of A-T by G-C at positions 5 and 6 abolishes the binding affinity of Hin (13). Indeed, A-T base pairs at positions 5 and 6 are universally present in all of the recombination sites of various enteric inversion systems: hixL and hixR; gixL and gixR; cixL and cixR; and pixL and pixR (1) (Fig. 6). Biochemical footprinting experiments also show that both intact Hin and the Hin peptide protect adenines 5 and 6 from methylation (10).

Pro141 arches across one wall of the minor groove, and the ε-imino of Arg142 is hydrogen-bonded to phosphate P8, an interaction that may be important in directing the amino-terminal arm into the minor groove of the DNA. Hin, Gin, Cin, and Pin all have Pro at position 141 or 142, followed immediately by a basic Arg or Lys. A significant role may be played by Ile144, which interacts with the hydrophobic core formed by packing helices 1 to 3 against one another. Ile144 may restrict movement of the amino-terminal arm, thus bringing Arg142 into proximity to phosphate P8. In other DNA invertases, position 144 is always a bulky hydrophobic side chain, either Leu or Tyr (Fig. 1A).

Minor groove protein-DNA interactions: the carboxyl-terminal tail. The carboxyl-terminal tail of the Hin polypeptide crosses the phosphodiester backbone at the

Fig. 4. (A) Stereo close-up of the interaction between DNA and helix 3 in the major groove, viewed approximately down the DNA helix axis. Base pair T11-A21 is nearest the viewer, with A10-T22 beneath, and G9-C23 farthest away. Strand 1 backbone continues to lower right through P9 and P8. Helix 3 is drawn as a smooth curve, with specific depiction, from top to bottom, of residues Gly172, Ser174, Thr175, Arg178, and Tyr179. Two shaded spheres are bound water molecules. (B) Schematic of specific base pair recognition involving Ser174, Arg178, and two bound water molecules. Along base edges, "a" marks a hydrogen bond acceptor (ring N-7 or carbonyl O-4 or O-6) and "d" marks a hydrogen bond donor (N-4 or N-6 amine group).



outer edge of the recombination site and then follows the minor groove back toward the center of the 13-bp DNA helix (Figs. 3 and 7). The final six residues of the Hin polypeptide, Ile185-Lys186-Lys187-Arg188-Met189-Asn190, adopt an extended conformation and lie within the minor groove, but the side chains themselves make no contacts with the floor of the groove. Instead, they point outward, with the polypeptide backbone resting against base edges. At the point where the final six amino acid residues dip into the minor groove, the main chain CO of Ile185 hydrogen bonds to the N-2 of G14. The main chain NH of Lys187 bonds to the O-2 of T20, and a little farther along, the main chain amide of Asn190 interacts with the O-2 of T22 while the side chain of the carboxyl-terminal residue bonds to the N-3 of A10. These interactions may be responsible for the large propeller twist and roll angles of base pairs 10 and 12 and the -16° bend of DNA toward the major groove. Consistent with the interactions just discussed, the N-3 of A10 is partially protected from dimethyl sulfate attack by Hin binding (10). In addition, Mack et al. (11) have noted that a Hin peptide lacking the last eight residues, when modified with EDTA-Fe, cleaved DNA with reduced efficiency as compared with a peptide containing the complete carboxyl terminus. This portion of the chain is variable among the DNA invertases; whereas Hin has six final amino acids, Gin has ten, Cin has three, and Pin has a lone Lys (Fig. 1A). This carboxyl-terminal tail is presumably supportive but not essential.

The hydrogen-bonded extension of the last six residues along the floor of the minor groove recalls AT-specific binding of minor groove drugs such as netropsin and distamycin (30, 31). Such binding involves an element of base specificity: if any of the base pairs 10 to 13 were G-C rather than A-T, then the tail of the Hin peptide would be pushed away from the floor of the groove. In another context, it has been proposed (32, 33) that an extended polypeptide containing repeats of SPKK (34) sequence may interact with DNA minor groove in a similar fashion to netropsin, with main chain amide nitrogen forming hydrogen bonds with base pairs in the minor groove. Our structure seems to provide a concrete example of such a model.

Fig. 5. Stereo views of the amino-terminal arm of the Hin peptide, in the minor groove of DNA. (A) The refined (2Fo - Fc) electron density map (blue framework) with minor groove vertical, showing Arg140-Pro141-Arg142 looping over the phosphate backbone toward the right. Protein is in red, and DNA is in green. (B) View along the minor groove, from the top in (A), showing the entire "A-tract" region, from T4-A28 at bottom through T8-A24 at top. This view is approximately that of the lower half of Fig. 3B. The ribbon extending from center toward upper right is the Hin peptide region Gly139-Arg140-Pro141-Arg142, with side chains drawn explicitly.

Fig. 6. Base sequences in enteric bacterial inversion sites: the left and right inversion sites from the Hin inversion system of Salmonella, the Gin inversion elements from bacteriophage Mu, Cin from phage P1, and Pin from the e14 prophage of Escherichia coli (1). Only one chain from each complex is shown; the other is complementary as in Fig. 1B. Each of the eight sites is built from two roughly symmetrical half-sites. Asterisks at bottom indicate positions that are especially important in site recognition by invertases.

[Sequence panel: rows hixL, hixR, gixL, gixR, cixL, cixR, pixL, and pixR; columns left half-site (positions -13 to -1) and right half-site (positions +1 to +13).]

The molecular basis of specificity. Two aspects of the Hin system make it especially conducive to an understanding of the effects of sequence on specificity. The first is the availability of binding data on 39 different base substitution mutations generated by Hughes et al. (13). They constructed a symmetrized hixC sequence in which the left half is given the complementary sequence to the right half shared by both hixL and hixR and established that this symmetrized hixC binds Hin fully as well as the wild-type hixL and hixR. (It is the hixC sequence that we used for crystallographic analysis.) They then constructed an exhaustive set of symmetrical mutants in the two halves of hixC, varying each of the 13 positions among all three of the other bases. Hence we have complete information about the strength of Hin binding with every possible single-base change in the primary hixC sequence. The second favorable aspect is the existence of four homologous DNA inversion systems: Hin, Gin, Cin, and Pin. Taken together, these provide 8 complete recombination sites or 16 binding half-sites (Fig. 6). How far can our x-ray analysis of



Table 2. Frequency of occurrence of bases at key positions in bacterial inversion sites. Sequences are read in a 5'-to-3' direction from the center of the inversion site as shown in Fig. 6. Hence for left half-sites the other strand, not shown in Fig. 6, is tabulated.

Position   Acceptable mutations in         DNA groove
           symmetrized hixC site (13)
    5      Not G, not C                    Minor
    6      Not G, not C                    Minor
    9      Not C                           Major
   10      Not T, not C                    Major
   11      Not C                           Major
   12      Not C                           Minor
   13      All equivalent                  Minor

[Columns giving the counts of G, A, T, and C at each position in the natural hix, gix, cix, and pix sites are not legible in this copy.]
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the Hin-DNA complex account for this 
wealth of data, and provide a molecular
basis for DNA-protein specificity?
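
The count of 39 mutants follows from varying each of the 13 half-site positions among the three other bases. A minimal sketch of that enumeration is given below; the function name is hypothetical, and the half-site string is the right half of hixL as read from Fig. 1B, used here purely as an illustrative input rather than as the exact hixC construct of Hughes et al.

```python
BASES = "GATC"


def single_base_mutants(half_site):
    """List every single-base substitution as (position, original, new)."""
    mutants = []
    for position, original in enumerate(half_site, start=1):
        for base in BASES:
            if base != original:
                mutants.append((position, original, base))
    return mutants


# Right half-site of hixL, positions +1 to +13 (illustrative input).
half_site = "AGGTTTTTGATAA"
mutants = single_base_mutants(half_site)

n_half_sites = 8 * 2  # 8 natural recombination sites, 2 half-sites each
print(len(mutants), n_half_sites)
```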

The Hin-DNA crystal structure shows that all three components of the Hin peptide, the amino-terminal arm, the HTH region, and the carboxyl-terminal tail, contribute to base sequence recognition. The phosphate backbone contacts of helix 3 help position Hin on the DNA, but one could easily imagine that Hin could slide along the DNA in a nonspecific manner until it encounters the correct local base sequence.

Interactions of Hin residues Ser174 and Arg178, both direct and through intermediate water molecules (Fig. 4B), place restrictions on base pairs 9, 10, and 11. Some latitude in base sequence is possible if different arrangements of hydrogen bond donors and acceptors are permitted. Those rearrangements that are possible without losing the total number of hydrogen bonds are shown in Fig. 8. Base 9 is restricted to being a purine (G or A) by virtue of the hydrogen bond donated to ring atom N-7. In complete agreement with this model, of the 16 half-sites shown in Fig. 6 and listed in Table 2, G occurs 13 times at position 9 and A occurs 3 times. No pyrimidine is ever found at that locus. Replacement of G at position 9 by A or T (and C at position -9 by T or A) in the mutant studies of Hughes et al. (13) is acceptable, but C at position 9 reduces binding significantly. Our crystal structure shows why: G, A, or T at position 9 offer a hydrogen bond acceptor to Arg178, whereas C offers a N-4 amine donor instead, and hence is disfavored. Even T is less favorable, because it positions the hydrogen bond acceptor differently and partially blocks it with its own C-5 methyl group.

Position 10 also must be a purine for the 
same reason as 9. The choice in 14 of the 
16 half-site sequences is A10, resulting in
T22 at the other end of the base pair, with
a hydrogen bond-accepting O-4 atom (Fig.
8A). However, G10-C22 is an acceptable
minority choice in two sequences, and in
that case the N-4 amine of C22 must
donate a hydrogen bond to water 1 (Fig.
8B). Water 1 then would donate a hydro-
gen bond to another water molecule not
shown here. 

Base pair 11 is more variable than might
have been expected, and for an interesting
reason. Water molecule 2 in Fig. 8A accepts
a hydrogen bond from the N-6 of A21 and
donates a bond to N-7, but all that is
required for a hydrogen balance is that this
water molecule should donate one hydrogen
and accept another. The two base atoms
could just as easily be thymine O-4 and
adenine N-6 as in Fig. 8A or cytosine N-4
and guanine O-6 as in Fig. 8B. The only
combination not permitted would be gua-
nine at position 21, with dual acceptors N-7
and O-6. For in that case, water 2 would not
have enough protons to form the bond with
the main chain carbonyl of Ser174. In other
words, the requirement on the strand 1 side
of base pair 11 is "not-C," and indeed this
requirement is borne out in Table 2 both by 
the invertase family sequence comparisons 
and mutational substitutions. 

The direction of the hydrogen bond
between water molecules (water 1 donat-
ing to water 2) actually is firmly estab-
lished. As Fig. 8C shows, if water 2 were to
donate a bond both to water 1 and to the
Ser174 main chain carbonyl, then the two
positions on base pair 11 would have to be
adjacent donors, and no base pair shows 
this behavior. The full pattern of hydrogen 
bonds can be maintained only by arrange- 
ments as in Fig. 8, A and B. 

A-T base pairs are favored at positions
12 and 13 because of a netropsin-like ex-
tension of the carboxyl-terminal six Hin
residues down the floor of the minor groove
(Fig. 7). Table 2 shows that A-T pairs
indeed are overwhelmingly favored at these
two loci, although mutant studies show 
position 13 to be more permissive. It could 
be that the similarity of sequences at this 
point among all of the enteric recombinases 
is a matter of evolutionary divergence from 
a common ancestral sequence, rather than 
convergence on function. 

At the other end of the recognition
domain, positions 4, 5, and 6 universally
are A-T base pairs in all of the DNA
inversion sites, and G-C pairs are strongly
disfavored at positions 5 and 6 in the
mutant studies. The x-ray structure shows
the reason: Hin residues Gly139 and Arg140
are so intimately linked to the floor of the
minor groove that there simply is no room
for the C-2 amine group of guanine. As
noted above, Gly139 and Arg140 are abso-
lutely essential for sequence-specific bind-
ing of Hin to its DNA site.

Fig. 8. [Caption largely illegible in the scan. Legible fragments: (B) hydrogen bonding at base pair 11 allowing G11-C21, although base 21 could not be G; base 10 is still required to be a purine; (C) the hydrogen bond connecting the water molecules cannot be reversed.]

Fig. 9. Optimal base sequence for enteric inversion half-sites. Numbers mark the distance out
from the center of the site as in Fig. 6. A/T = an A-T base pair in either orientation. N = any base.
[Sequence diagram not legible in the scan.]

In summary, the recognition element of 
the enteric inversion half-sites appears to 
involve two A-T base pairs (5 and 6)
recognized by amino acid residues Gly139
and Arg140 in the minor groove, two non-
specific base pairs (7 and 8), and then a five
base sequence (9 to 13) recognized by helix
3 and the carboxyl-terminal tail in major
and minor grooves, respectively. The opti-
mal binding sequence is shown in Fig. 9.
The Hin dimer has ~100-fold higher bind-
ing affinity to the full recombination site 
than does the Hin peptide binding to a 
half-site, and hence cooperative interac- 
tions by the Hin dimer may also contribute 
to recognition. 
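
The recognition pattern summarized above can be written as a position-by-position check. The Python sketch below is an illustration only and is not part of the original analysis. The function name, the rule table, and the example sequences are assumptions. The rules encode only the natural-site pattern described in the text (A or T at positions 5, 6, 12, and 13; any base at 7 and 8; a purine at 9 and 10; anything but C at 11) and ignore the more permissive behavior reported in the mutant studies.

```python
# Hypothetical rule table: allowed bases per half-site position, read 5'-to-3'
# outward from the center of the inversion site (positions 5 through 13).
HALF_SITE_RULES = {
    5: set("AT"),     # A-T pair, minor-groove contact
    6: set("AT"),     # A-T pair, minor-groove contact
    7: set("ACGT"),   # nonspecific
    8: set("ACGT"),   # nonspecific
    9: set("AG"),     # purine
    10: set("AG"),    # purine
    11: set("AGT"),   # "not-C"
    12: set("AT"),    # A-T pair, minor-groove contact
    13: set("AT"),    # A-T pair, minor-groove contact
}


def check_half_site(seq, first_position=5):
    """Return a list of (position, base) pairs that violate the rule table.

    `seq` is assumed to start at `first_position`; positions without a rule
    are ignored.
    """
    violations = []
    for offset, base in enumerate(seq.upper()):
        position = first_position + offset
        allowed = HALF_SITE_RULES.get(position)
        if allowed is not None and base not in allowed:
            violations.append((position, base))
    return violations


if __name__ == "__main__":
    # Made-up example sequences covering positions 5-13.
    for candidate in ("TTCTGAAAA", "TTCTCAAAA", "GTCTGACAA"):
        print(candidate, check_half_site(candidate))
```

An empty list means the sequence is consistent with the pattern. Because the rule table is a plain dictionary, a more permissive variant (for example, allowing any base at position 13) only requires changing one entry.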

Hin-DNA versus other HTH DNA-
binding complexes. The HTH motif occurs
frequently in DNA-binding proteins, in-
cluding prokaryotic regulators (18, 20, 28,
35), eukaryotic homeodomains (36-38),
eukaryotic transcription factors such as
HNF-3γ (39), the Oct-1 POU-specific do-
main, POUs (40), the third tandem repeat
of the cMyb protein (41), and the globular
domain of histone H5, GH5 (42). Com-
plexes of these proteins with DNA show
two principal variants that are represented
in Fig. 10 by the λ repressor and the
engrailed homeodomain. In all cases, rec-
ognition helix 3 fits into the major groove
and helix 2 runs across the width of the
groove. In homeodomain structures, helix 1
lies essentially antiparallel to helix 2, which
also spans the width of the major groove.
The residues preceding helix 1 are posi-
tioned to interact with the minor groove,
and recognition helix 3 is oriented along
the floor of the major groove. This pattern
is persistent; the cluster of three helices is
virtually superimposable from one home-
odomain complex to the next. In contrast,
prokaryotic regulatory proteins (18, 20, 28,
35) have similar dispositions of helices 2
and 3, but with the exception of trp repres-
sor, helix 1 is swung out and away from the
DNA (Fig. 10A). Helix 3, on the other
hand, tends to lie parallel to the edges of
base pairs in the prokaryotic regulators,
rather than along the floor of the groove as 
in homeodomains. 

The present Hin-DNA complex is inter-
mediate between these two structures. Rec-
ognition helix 3 is parallel to base pair edges
rather than to the groove itself, as with
prokaryotic repressors, but helix 1 is nearly
antiparallel to 2, with its amino terminus
extending toward the minor groove where a
nonhelical chain continuation contributes
interactions essential to specific recogni-
tion. The alignment of helix 3 parallel to
base pairs in Hin and repressors is function-
al: comparison of the structures of 434
repressor-operator, 434 Cro-operator, λ re-
pressor-operator, and CAP-DNA complexes
shows that amino acids at positions 1, 2,
and 6 of the helix make specific contacts
with bases in the major groove in each case
(43). Similarly, Hin uses positions 2 and 6
(Ser174 and Arg178) for its primary recogni-
tion process.

Fig. 10. Comparative interactions of a three-helix unit with the DNA double helix in the λ repressor-
operator complex (left), the Hin-DNA complex (center), and the engrailed homeodomain complex
(right). In each case, the carboxyl-terminal helix of the three is inserted into the major groove

The disposition of helix 1 with respect
to the DNA is not completely identical in
Hin and homeodomains; in the latter, helix
1 crosses perpendicular to the walls of the
major groove, whereas Hin has helix 1
aligned parallel with the overall DNA axis.
The two loops connecting helices are short-
er in Hin than in the homeodomain pro-
teins and more like those of prokaryotic
repressors. In addition, the α helices them-
selves are shorter than their homeodomain
equivalents, especially helix 3, which in
both yeast α2 and Drosophila engrailed ho-
meodomains is 17 amino acids long.
There also is a remarkable difference in
minor groove binding by the sequence Arg-
Pro-Arg in the amino-terminal arm of Hin
and in the engrailed homeodomain. In
Hin, Arg140 makes specific base contacts,
whereas Arg142 anchors the two-pronged
binding to the phosphate backbone of
DNA. In the engrailed homeodomain,
Arg3-Pro4-Arg5 inserts both Arg side chains
into the minor groove and makes specific
base contacts, and it is the adjacent Thr6
that interacts with phosphate backbone.
Thus, the same three-amino acid recogni-
tion module can interact with the minor
groove in two profoundly different ways to
generate contacts essential for protein-
DNA binding.

In another significant difference, the
amino-terminal arms of homeodomain pro-
teins run along the minor groove in the
direction of the HTH unit itself, whereas
the amino-terminal arm of the Hin protein
runs in an opposite direction (Figs. 3A and
9), toward what would be the center of the
recombination site. Model-building of an
intact recombinase bound to a complete hix
site suggests that the positions cleaved by
Hin during recombination are located on
the opposite face of hix from the DNA-
binding domains. It is likely that residues
preceding the amino-terminal arm within
an intact Hin protomer may continue run-
ning along the minor groove around the
DNA and link with the catalytic domain 
that presumably is positioned over the 
cleavage site. 
contains large identifiable cells and is
consequently, like Hirudo and Aplysia, 
a favourite preparation for studying 
neural mechanisms at the cellular level, 
and in particular for studying isolated 
neurons in culture. 

helix-coil transition See random coil 

helix-destabilising proteins (single- 
stranded binding proteins) Proteins in- 
volved in DNA replication. They bind 
cooperatively to single-stranded DNA, 
preventing the reformation of the 
duplex and extending the DNA back- 
bone, thus making the exposed bases 
more accessible for base pairing. 

helix-loop-helix A motif associated with 
transcription factors, allowing them to 
recognise and bind to specific DNA 
sequences. Two α-helices are separated
by a loop. Examples: myoblast MyoD1,
c-myc, Drosophila genes daughterless, 
hairy, twist, scute, achaete, asense. Not the 
same as helix-turn-helix. 

helix-turn-helix A motif associated with 
transcription factors, allowing them to 
bind to and recognise specific DNA 
sequences. Two amphipathic α-helices
are separated by a short sequence with a
β-sheet. One helix lies across the major
groove of the DNA, while the recogni-
tion helix enters the major groove and
interacts with specific bases. An example
in Drosophila is the homeotic gene fushi
tarazu, that binds to the sequence
TCAATTAAATGA. Not the same as
helix-loop-helix.

helodermin See exendin.

helospectin See exendin. 

helper factor A group of factors appar-
ently produced by helper T-lymphocytes
that act specifically or non-specifically to 
transfer T-cell help to other classes of 
lymphocytes. The existence of specific 
T-cell helper factor is uncertain. 

helper T-cell See T-helper cells. 

helper virus A virus that will allow the 
replication of a co-infecting defective 
virus by producing the necessary pro- 
tein. 



hema-, hemo- See haema, haemo. 
heme See haem.

hemicellulose Class of plant cell-wall 
polysaccharide that cannot be extracted 
from the wall by hot water or chelating 
agents, but can be extracted by aqueous 
alkali. Includes xylan, glucuronoxylan, 
arabinoxylan, arabinogalactan II,
glucomannan, xyloglucan and galacto- 
mannan. Part of the cell-wall matrix. 

hemidesmosome Specialised junction 
between an epithelial cell and its basal 
lamina. Although morphologically sim- 
ilar to half a desmosome (into which 
intermediate cytokeratin filaments are 
also inserted), different proteins are 
involved. 

hemizygote Nucleus, cell or organism 
that has only one of a normally diploid 
set of genes. In mammals the male is 
hemizygous for the X-chromosome. 

Hensen's node (primitive knot) Thicken- 
ing of the avian blastoderm at the 
cephalic end of the primitive streak. 
Presumptive notochord cells become 
concentrated in this region. May well be 
a source of retinoic acid that is acting as a 
morphogen in the developing embryo. 

heparan sulphate (glycosaminoglycan) 
Constituent of membrane-associated 
proteoglycans. The heparan sulphate- 
binding domain of NCAM is proposed to 
augment NCAM-NCAM interactions, 
suggesting that cell-cell bonds mediated 
by NCAM may involve interactions 
between multiple ligands. The putative 
heparin-binding site on NCAM is a 28 
amino acid peptide shown to bind both 
heparin and retinal cells, as well as to 
inhibit retinal cell adhesion to NCAM. 
This strengthens the argument that
this site contributes directly to NCAM- 
mediated cell-cell adhesion. 

heparin Sulphated mucopolysaccharide, 
found in granules of mast cells, that 
inhibits the action of thrombin on fibrin- 
ogen by potentiating anti-thrombins, 
thereby interfering with the blood- 
clotting cascade. Platelet factor IV will 
neutralise heparin. 






oxygen, and thus 
;en-fixing enzyme, 
; oxygen sensitive. 

of Gram-negative 
La. Most species are 
ins, causing pneu- 
e.g. Legionnaires' 
jr an outbreak in 
st members of an 
inion. 

ige protein of the 
\er legumes. 

Hg cells. 

muscle analogue 
oponin. Two sub- 

and C, the latter 
>mologous to calm- 
C. 

anovsky-type stain; 
\d acidic dyes used 
s and which differ- 
as classes of leuco- 



>e caused by proto- 
e genus Leishmania. 

intracellularly in 
is forms of the dis- 
;pending upon the 
i particular visceral 
-azar) and muco- 
asis. 

nily of non-onco- 
tat cause "slow dis- 
racterised by hori- 



sIAc 

; > a-o-GlcN Ac 

c > a-D-GlcNAc 

IcNAc 

ilNAc 

c 

t-Gal > a-t>Gal 
nGalNAc 

acetyl neuraminic acid 
; Man = mannose. 



zontal transmission, long incubation 
periods and chronic progressive phases. 
Visna virus is in this group, and there are 
similarities between visna, equine infec- 
tious anaemia virus and HIV. 

lentoid Spherical cluster of retinal cells, 
formed by aggregation in vitro, that has a 
core of lens-like cells inside which accu- 
mulate proteins characteristic of normal 
lens. The cells concerned derive from 
retinal glial cells. 

Lepore haemoglobin Variant haemo- 
globin in a rare form of thalassaemia; 
there is a composite δ-β chain as a result
of an unequal crossing-over event. The
composite chain is functional but syn-
thesised at a reduced rate.

leprosy Disease caused by Mycobacterium 
leprae, an obligate intracellular parasite 
that survives lysosomal enzyme attack 
by possessing a waxy coat. Leprosy is 
a chronic disease associated with 
depressed cellular (but not humoral) 
immunity; the bacterium requires a tem- 
perature lower than 37°C, and thrives 
particularly in peripheral Schwann cells 
and macrophages. Only humans and the 
nine-banded armadillo are susceptible. 

leptonema See leptotene. 

Leptospira Genus of spirochaete bacteria 
that cause a mild chronic infection in rats 
and many domestic animals. The bac- 
teria are excreted continuously in the 
urine, and contact with infected urine or 
water can result in infection of humans 
via cuts or breaks in the skin. Infection 
causes leptospirosis or Weil's disease, a 
type of jaundice, which is an occupa- 
tional hazard for sewerage and farm 
workers. 

leptospirosis Weil's disease, caused by
infection with Leptospira. 

leptotene Classical term for the first stage 
of prophase I of meiosis, during which 
the chromosomes condense and become 
visible. 

Lesch-Nyhan syndrome A sex-linked 
recessive inherited disease in humans 
that results from mutation in the gene for 



the purine salvage enzyme HGPRT, 
located on the X-chromosome. Results in 
severe mental retardation and distress- 
ing behavioural abnormalities, such as 
compulsive self-mutilation. 

lethal mutation Mutation that even- 
tually results in the death of an organism 
carrying the mutation. 

LETS (large extracellular transforma- 
tion /trypsin-sensitive protein) Origi- 
nally described as a cell surface protein 
that was altered on transformation in 
vitro; now known to be fibronectin. 

Leu enkephalin A natural peptide 
neurotransmitter; see enkephalins. 

leucine (leu; L; 2-amino-4-methyl-
pentanoic acid; 131 D) The most abund-
ant amino acid found in proteins. Con- 
fers hydrophobicity and has a structural 
rather than a chemical role. See Table 
A2. 

leucine aminopeptidase An exopep- 
tidase that removes neutral amino acid 
residues from the N-terminus of pro- 
teins. 

leucine zipper Motif found in certain 
DNA-binding proteins. In a region of 
approximately 35 amino acids, every 
seventh is a leucine. This facilitates 
dimerisation of two such proteins to 
form a functional transcription factor. 
Examples of proteins containing leucine 
zippers are products of the proto- 
oncogenes myc, fos and jun. See also 
AP-1. 
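
The spacing rule in the leucine zipper entry (a leucine at every seventh residue) lends itself to a simple scan. The Python sketch below is illustrative only: the function name, the default of four consecutive heptad leucines, and the test sequence are assumptions rather than anything stated in the entry, and real motif detection would need to tolerate substitutions.

```python
def heptad_leucine_runs(sequence, min_repeats=4):
    """Return 0-based start indices where 'L' occurs at every 7th residue
    for at least `min_repeats` consecutive heptad positions.

    `sequence` is a one-letter amino acid string. The threshold is an
    assumed default, not a value from the dictionary entry.
    """
    sequence = sequence.upper()
    span = 7 * (min_repeats - 1)  # distance from first to last required leucine
    hits = []
    for start in range(len(sequence) - span):
        if all(sequence[start + 7 * k] == "L" for k in range(min_repeats)):
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Artificial sequence: a leucine followed by six alanines, repeated.
    toy = "LAAAAAA" * 4 + "L"
    print(heptad_leucine_runs(toy))
```

With five leucines spaced seven apart, the toy sequence spans 29 residues, which is in line with the "approximately 35 amino acids" quoted in the entry.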

leucinopine (dicarboxypropyl leucine) 
An analogue of nopaline found in crown 
gall tumours (induced by Agrobacterium
tumefaciens) that do not synthesise
octopine or nopaline. 

leucocidin Exotoxins from staphylococ- 
cal and streptococcal species of bacteria 
that cause leucocyte killing or lysis. 

leucocyte (USA leukocyte) Generic term 
for a white blood cell. See basophil, 
eosinophil, lymphocyte, monocyte, neu- 
trophil. 

leucocytosis An excess of leucocytes in 
the circulation. 
Z scheme of photosynthesis A sche-
matic representation of the light reac-
tions of photosynthesis, in which the 
photosynthetic reaction centres and elec- 
tron carriers are arranged according to 
their electrode potential (free energy) 
in one dimension and their reaction 
sequence in the second dimension. This 
gives a Z shape, the two reaction centres 
(of photosystems I and II) being linked 
by the photosynthetic electron transport 
chain. 

Z-disc Region of the sarcomere into 
which thin filaments are inserted. Loca- 
tion of a-actinin in the sarcomere. 

Z-DNA See DNA. 

zeatin A naturally occurring cytokinin,
originally isolated from maize seeds. Its 
riboside is also a cytokinin. 

zebrafish Brachydanio rerio; fish with a 
transparent embryo making it possible 
to follow progeny of single cells until 
quite late stages of development. This, 
together with the availability of mutant 
lines, makes it an important preparation 
for the study of vertebrate cell lineage. 



zeta potential The electrostatic potential
of a molecule or particle, e.g. cell, meas-
ured at the plane of hydrodynamic
slippage outside the surface of the mole-
cule or cell. Usually measured by electro-
phoretic mobility. Related to the surface
potential and a measure of the electro-
static forces of repulsion the particle or
molecule is likely to meet when encoun-
tering another of the same sign of charge.
See cell electrophoresis.

Zigmond chamber See orientation
chamber.

zinc An essential "trace" element being
an essential component of the active site
of a variety of enzymes. Zn2+ has high
affinity for the side-chains of cysteine
and histidine. Zinc is present in tissues at
levels of ca. 0.1 mM; but intracellular
levels must be much lower.

zinc finger A motif associated with
DNA-binding proteins. A loop of 12
amino acids contains either two cysteine
and two histidine groups (a "cysteine-
histidine" zinc finger), or four cysteines
(a "cysteine-cysteine" zinc finger), that
directly coordinate a zinc atom. The
loops (usually present in multiples)
intercalate directly into the DNA helix.
Originally identified in the RNA
polymerase III transcription factor
TFIIIA.

zipper See leucine zipper.

zippering Process suggested to occur in
phagocytosis in which the membrane of
the phagocyte covers the particle by a
progressive adhesive interaction. The
evidence for such a mechanism comes
from experiments in which capped
B-cells are only partially internalised,
whereas those with a uniform opsonis-
ing coat of anti-IgG are fully engulfed.

ZO-1 High molecular weight protein
(225 kD in mouse, 210 kD in MDCK cells)
associated with zonula occludens (tight
junction) in many vertebrate epithelia.
Cingulin, which is distinct, is found in
the same region.

zona pellucida A translucent non-
cellular layer surrounding the ovum of
many mammals.


zone of polarising activity The small 
group of mesenchyme cells in avian limb 
buds that is located at the posterior 
margin of the developing bud and that 
produces a substance, possibly retinoic 
acid, which provides positional informa- 
tion to the developing limb bud. 

zonula adhaerens Specialised intercel- 
lular junction in which the membranes 
are separated by 15-25 nm, and into
which are inserted microfilaments. Sim- 
ilar in structure to two apposed focal 
holoblastic Form of cleavage.

holocarpic (Of fungi) having the mature
thallus converted in its entirety to a repro-
ductive structure. Compare eucarpic.

Holocene (recent) The present, post- 
Pleistocene, epoch (system) of the quatern- 
ary period. 

holocentric Of chromosomes with diffuse
centromere activity, or a large number of
centromeres. Common in some insect
orders (Heteroptera, Lepidoptera) and a few
plants (Spirogyra, Luzula).

Holocephali Subclass of the chondrich-
thyes, including the ratfish Chimaera. First
found in Jurassic deposits. Palatoquadrate
fused to cranium (autostylic jaw suspen-
sion). Grouped with elasmobranchs
because of their common loss of bone.

holocrine gland Gland in which entire 
cells are destroyed with discharge of con- 
tents (e.g. sebaceous gland). Compare apo- 
crine GLAND, MEROCRINE GLAND. 

holoenzyme Enzyme/cofactor complex. 

See ENZYME, APOENZYME.



hologamod 



See oeme. 



Holometabola Those insects with a pupal 
stage in their life history. See endopterygota.
Compare thysanoptera. 

holophytic Having plant-like nutrition; 
i.e. synthesizing organic compounds from 
inorganic precursors, using solar energy 
trapped by means of chlorophyll. Effec-
tively a synonym of photoautotrophic.
Compare holozoic; see autotrophic. 

Holostei Grade of actinopterygii which
succeeded the chondrosteans as dominant 
Mesozoic fishes. Oceanic forms became 
extinct in the Cretaceous, but living fresh- 
water forms include the gar pikes, Lepisos- 
teus, and bowfins, Amia. Superseded in late 
Triassic and Jurassic by teleostei. Tendency 
to lose ganoine covering to scales.

Holothuroidea Sea cucumbers. Class of 
echinodermata. Body cylindrical, with 
mouth at one end and anus at the other; 
soft, muscular body wall with skeleton of 



scattered, minute plates; no spines or pedic-
ellariae; suckered tube feet; bottom-dwellers,
often burrowing; tentacles (modified tube 
feet) around mouth for feeding. Lie on their 
sides. 

holotype (type specimen) Individual 
organism upon which naming and descrip- 
tion of new species depends. See neotype, 

BINOMIAL NOMENCLATURE. 

holozoic Feeding in an animal-like 
manner. Generally involves ingestion of 
solid organic matter, its subsequent diges- 
tion within and assimilation from a food 
vacuole or gut, and egestion of undigested
material via an anus or other pore. Compare

HOLOPHYTIC. 

homeobox Conserved DNA sequence 
motif of ~180 base pairs, encoding DNA-
binding regions of many proteins, specifi-
cally HMG boxes, al domains and
homeodomains. Most proteins containing
these regions are transcriptional regulators,
and a single amino acid alteration can
change the promoters to which they bind,
potentially altering the downstream activa-
tion pattern of genes. First identified in 1984
within several homeotic genes of Drosophila,
the homeobox product confers the helix-
turn-helix motif upon a protein, giving it
its DNA-binding properties (see DNA-
BINDING PROTEINS, REGULATORY GENE). Mating-
type genes in ascomycete fungi control
sexual development and contain homeo- 
box motifs, and the rapidly evolving 
homeobox gene, Odysseus, is responsible for 
reproductive isolation (see speciation) 
between sibling species of Drosophila. 

homeodomain See homeobox. 

homeogenes HOMEOBOX-containing genes
whose protein products are all transcription 
factors; of wide (possibly universal) occur- 
rence in animals, from cnidarians to ver- 
tebrates, but also present in other 
eukaryotes (plants, fungi and slime moulds, 
at least). Homeogenes include all homeotic 
and some segmentation genes. One subset 
of homeotic genes, the Hox (Antennapedia- 
like) homeogene family, is involved in 
encoding the relative positions of structures 
ABSTRACT The structures of the compounds we call 3a,
3b, and 3c, compounds that incorporate (i) the tripyrrole
peptide of the minor-groove-binding distamycin class of com-
pounds and (ii) polyamine ligands that extend from the minor
groove and can interact with phosphodiester bonds, were
arrived at by computer-graphics designing by using the x-ray
structure of distamycin A complexed in the minor groove of
d(CGCAAATTTGCG)2. Compounds 3a, 3b, and 3c are elab-
orations of distamycin analog 2, designed for improved stability
in solution and easier synthesis and purification, which itself
binds weakly to DNA. Compounds 3a, 3b, and 3c have been
synthesized, and the interaction of distamycin A, 2, 3a, 3b, and
3c with calf thymus DNA, [...]
dC), pBR322 superhelical plasmid DNA, and
T4 coliphage DNA have been studied. The following pertinent
conclusions can be drawn. Binding of 3a, 3b, and 3c occurs in
the minor groove of DNA and, because of favorable electro-
static interaction of diprotonated polyamine side chains and
DNA phosphodiester linkages, the tenacity of DNA binding and
site specificity of 3a, 3b, and 3c are comparable to that of native
distamycin A. 3b has been found to induce changes in the
superhelical density of pBR322 plasmid DNA. The study
establishes that the central pyrrole N-CH3 substituent of 2 can
be replaced by bulky polyamine metal ligands to create any
number of compounds that bind into the minor groove at
A+T-rich sites and are putative catalysts for the hydrolysis of
DNA.

It has been estimated that the half-life for the hydrolysis of a
simple dialkyl phosphate ester at pH 7.0 is about 200 million 
years (1). It is apparent then why phosphodiester bonds link 
the letters of the genetic code. Chemists have yet to design 
and prepare worthwhile catalysts for the hydrolysis of dialkyl 
phosphate esters. This remains a desirable goal. Having small 
molecules that catalyze the hydrolysis of DNA at given 
sequences could be of great advantage in the study of DNA 
structure and function. 
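
To put the quoted half-life in perspective, it can be converted to a rate constant. The Python sketch below assumes simple first-order kinetics, for which k = ln 2 / t(1/2). That kinetic model, the function name, and the use of a 365.25-day year are assumptions made for illustration and are not taken from the article.

```python
import math

HALF_LIFE_YEARS = 2.0e8                      # "about 200 million years" from the text
SECONDS_PER_YEAR = 365.25 * 24 * 3600        # assumed Julian year


def first_order_rate_constant(half_life):
    """Rate constant for first-order decay with the given half-life.

    The result has units of 1 / (units of half_life).
    """
    return math.log(2) / half_life


k_per_year = first_order_rate_constant(HALF_LIFE_YEARS)
k_per_second = first_order_rate_constant(HALF_LIFE_YEARS * SECONDS_PER_YEAR)

if __name__ == "__main__":
    print(f"k = {k_per_year:.3e} per year")
    print(f"k = {k_per_second:.3e} per second")
```

Under these assumptions the uncatalyzed rate constant is on the order of 1e-16 per second, which gives a sense of the rate acceleration any catalyst for this hydrolysis would have to supply.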

The desired characteristics of a small molecule capable of 
hydrolyzing DNA would include: (i) its binding to a specific 
base sequence of DNA, and (ii) a catalytic site holding metal
ions or protons and a nucleophile juxtaposed to the phos-
phodiester bond. The hydrolysis of DNA phosphodiester
linkages is catalyzed by nuclease enzymes, which require
metal ions for their activity. Both Mg2+ and Zn2+ are directly
involved in the 3'-to-5' exonuclease activity of the Klenow
fragment of DNA polymerase I from Escherichia coli. Evi-
dence for the catalytic role of the metal ions comes from the
x-ray crystallographic data of a cocrystal of DNA and the
Klenow fragment (2, 3). It has been proposed that the 
established (4) in-line displacement of the 5' oxygen is by 

The publication costs of this article were defrayed in part by page charge
payment. This article must therefore be hereby marked "advertisement"
in accordance with 18 U.S.C. §1734 solely to indicate this fact.



HO" ligated to the Zn 2+ and that the incipient 5' oxyanion 
leaving group is coordinated by Mg2+. A combination of two
metals has also been reported as essential to phosphodiester
hydrolysis by E. coli alkaline phosphatase (5) and phospho-
lipase C from Bacillus cereus (6). There are several model
studies that support roles for metal ions in phosphodiester
bond hydrolysis (7, 8). Positively charged arginines in the
active site of hydrolytic nucleases (9) have also been sug- 
gested to play an important role in catalysis by interacting 
with the negatively charged dialkyl phosphate ester substrate 
(10). It is not inconceivable that a single low molecular weight 
molecule could play both types of catalytic roles. Thus, a 
catalytic protonated polyamine might also exhibit catalysis 
by ligation of the proper metal ion (11). 

We report at this time the design, synthesis, and DNA
binding of representatives of a class of compounds (3a, 3b, 
and 3c in Fig. 1) that incorporate the tripyrrole peptide of the 
minor-groove-binding distamycin (compound 1 in Fig. 1) 
class of compounds and, in addition, polyamine ligands that 
can interact with phosphodiester bonds and extend outwards 
from the minor groove. The complexity of the compounds we 
report here is sufficient to provide an answer to our initial 
concern as to whether steric hindrance by the bulky ligands 
and their tether prevents minor-groove binding or if electro- 
static interaction of the putative catalytic groups with the 
phosphodiester linkage enhances minor-groove binding. We 
have found the latter situation to prevail. 

MATERIALS AND METHODS 
Materials. Distamycin A and ethidium bromide (EtdBr) 
were purchased from Sigma. T4 coliphage DNA was from 
Sigma and was extracted with phenol before use. Sonicated 
calf thymus DNA with an average length of 500 base pairs, 
pBR322 plasmid DNA, and all synthetic polynucleotides 
were from Pharmacia. Topoisomerase I from calf thymus was
purchased from GIBCO/BRL. 

Synthesis of Distamycin Analogs. Synthetic sequences for 
preparation of distamycin analogs 3a, 3b, and 3c are shown
in Fig. 2 along with the conditions and reagents used. 
Compounds 2, 4a, and 4b were synthesized following similar 
procedures. With the exception of the use of diethyl cyano- 
phosphonate (DECP) as the coupling reagent for peptide 
synthesis, the procedures used to create the tripeptide from 
its monomers are basically similar to the published methods 
(12-15). Detailed procedures will be published elsewhere. 
The structural identification of 3a, 3b, and 3c was established
by Fourier transform IR (Perkin-Elmer 1600), 1H NMR
(General Electric GN-500), low-resolution MS (VG 1 1-250), 
and elemental analysis. 



Abbreviations: DECP, diethyl cyanophosphonate; EtdBr, ethidium 
bromide. 
Fig. 1. Structures of compounds 1-4b.

Compound 3a. Analysis calculated for C37H61N11O4 +
1.5H2O: C, 59.18; H, 8.59; N, 20.52. Found: C, 58.94; H,
8.29; N, 20.16.

Compound 3b. Analysis calculated for C38H63N11O4 +
4H2O: C, 56.34; H, 8.83; N, 19.02. Found: C, 56.54; H, 8.63;
N, 18.72.

Compound 3c. Analysis calculated for C39H65N11O4 +
2.5H2O: C, 58.77; H, 8.85; N, 19.33. Found: C, 58.79; H,
8.86; N, 18.76. 
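
The calculated percentages above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the molecular formulas C37H61N11O4 (3a), C38H63N11O4 (3b), and C39H65N11O4 (3c), which is the reading of the printed formulas that reproduces the reported values, and it uses rounded standard atomic masses. The helper name `percent_composition` is hypothetical.

```python
# Standard atomic masses (g/mol), rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def percent_composition(formula, waters=0.0):
    """Mass percentages for a formula dict plus `waters` mol of H2O per mol."""
    counts = dict(formula)
    counts["H"] = counts.get("H", 0) + 2 * waters  # each water adds 2 H
    counts["O"] = counts.get("O", 0) + waters      # and 1 O
    total = sum(ATOMIC_MASS[el] * n for el, n in counts.items())
    return {el: 100 * ATOMIC_MASS[el] * n / total for el, n in counts.items()}


# name -> (assumed formula, waters of hydration, reported calculated C/H/N)
COMPOUNDS = {
    "3a": ({"C": 37, "H": 61, "N": 11, "O": 4}, 1.5, (59.18, 8.59, 20.52)),
    "3b": ({"C": 38, "H": 63, "N": 11, "O": 4}, 4.0, (56.34, 8.83, 19.02)),
    "3c": ({"C": 39, "H": 65, "N": 11, "O": 4}, 2.5, (58.77, 8.85, 19.33)),
}

for name, (formula, waters, reported) in COMPOUNDS.items():
    pct = percent_composition(formula, waters)
    computed = tuple(round(pct[el], 2) for el in "CHN")
    print(name, "computed C/H/N:", computed, "reported:", reported)
```

Under these assumptions the computed C, H, and N percentages agree with the reported calculated values to within rounding, which supports the hydrate stoichiometries given for each compound.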

Molecular Modeling and Computational Analysis. These 
procedures were carried out on a Silicon Graphics (Mountain 
View, CA) Iris 4D/220GTX workstation using CHARMm (16)
(version 21.2) and QUANTA (version 3.0) programs (Polygen,
Waltham, MA). Coordinates for the distamycin
A-d(CGCAAATTTGCG)₂ complex were from the Brookhaven
Protein Databank (17). Coordinates for the amine ligand
-N(CH₂CH₂CH₂NMe₂)₂ and for aliphatic chains (CH₂)ₙ (n =
3, 4, and 5) tethered on pyrrole nitrogen were generated by
using the ChemNote program in QUANTA. Atomic partial
charges of the atoms of 2, 3a, 3b, and 3c were calculated by
using the MNDO program (18) incorporated in QUANTA.

DNA Binding Affinities. The EtdBr displacement method 
(19) was used to determine DNA binding affinities. Fluores- 
cence measurements were performed on a LS-50 Perkin- 
Elmer spectrofluorometer with 546 nm as excitation wave- 
length and 591 nm as the emission wavelength. Standard 
solutions for fluorescence measurements contained 40 mM 
NaCl, 25 mM Tris (pH 7.5), 2 µM DNA, and 2.6 µM EtdBr.
UV-visible absorption spectra were obtained, and spectral 
analysis and data fitting were performed with an OLIS 
(Athens, GA) Cary-14 spectrophotometer and OLIS software.
Gel electrophoresis was performed on an H5 horizontal
electrophoresis apparatus from GIBCO/BRL. 

RESULTS AND DISCUSSION 

Peter Dervan and associates (20) have developed molecules
that bind in the minor groove of DNA with high selectivity for
A+T-rich sequences and catalyze DNA oxidative cleavage
by use of Fenton chemistry. An entire subfield now deals
with the cleavage of DNA by sequence-selective degradation
of deoxyribose or base entities with redox chemistry (21-23).
Our ultimate goal is the design of molecules that selectively
catalyze the hydrolysis of DNA. In the design of the molecules
reported here, we have used Dervan's approach of
modification of known A+T sequence-selective
minor-groove-binding molecules.

Fig. 2. Synthesis of 3a, 3b, and 3c. Reagents and conditions for
synthetic steps a-l: a, K, KI, Cl(CH₂)ₘCH(OEt)₂, m = 2, 3, and 4,
dimethylformamide (DMF), 80°C, 5 hr; b, 90% acetone/10% H₂O,
pyridinium p-toluenesulfonate, reflux, 10 hr; c,
HN(CH₂CH₂CH₂NMe₂)₂, NaBH₃CN, MeOH, AcOH, room temperature
(RT), 72 hr; d, H₂, 10% palladium on activated carbon (Pd/C),
MeOH, RT, 1 atm; e, N-methyl-4-nitropyrrole-2-carboxylic acid,
DECP, Et₃N, DMF, RT, 10 hr; f, NaOH, 80% EtOH/20% H₂O, reflux,
24 hr; g, 10% HCl, 0°C, to pH ≈ 7; h, H₂N(CH₂)₃NMe₂, DECP,
tetrahydrofuran, RT, 10 hr; i, H₂, 10% Pd/C, MeOH, RT, 1 atm;
j, 5, DECP, Et₃N, DMF, RT, 10 hr; k, H₂, 10% Pd/C, MeOH, RT,
1 atm; l, CH₃COCl, Et₃N, DMF, RT, 10 hr. (1 atm = 101.3 kPa.)

Molecular Designing. Molecular designing was based upon
the x-ray crystallographic structure of distamycin A
(compound 1) complexed in the minor groove of
d(CGCAAATTTGCG)₂ (17). As shown by this x-ray structure and
numerous binding studies (24), the crescent-shaped distamycin
tripeptide binds at A-T-rich regions of DNA with a binding constant
of about 10⁶ M⁻¹. When one examines the atoms that
neighbor the DNA phosphate groups in the 1-d(CGCAAATTTGCG)₂
complex, it becomes immediately evident that the DNA phosphate
groups have about the same periodicity as do the distamycin
pyrrole rings. Further, the carbons of the N-methyl groups
substituted on the pyrrole rings of 1 are located about 5 Å
away from the DNA phosphate phosphorus atoms (Fig. 3 Top). If
1 or a distamycin-like molecule were to be used to provide
selective binding of a catalytic device for phosphate ester
bond hydrolysis, then the catalytic device should be covalently
attached in place of an N-methyl group. It has been established
that the replacement of the N-methyl substituent of 1 by
N-propyl (25) or even N-isoamyl (26) does not prevent binding
to DNA.

Fig. 3. (Top) From the crystal structure of the complex of
distamycin A with d(CGCAAATTTGCG)₂ (ref. 17), it can be seen
that the distances between the carbons (green, with
Corey-Pauling-Koltun solid model) of the pyrrole N-methyl
substituents of distamycin and the neighboring phosphate
phosphorus atoms (yellow, with van der Waals dot surface) are
about 5 Å. (Middle) A proposed binding model for 3a interacting
with an A-T-rich region of DNA. Coordinates for the DNA and
distamycin were from the x-ray crystal structure of the
distamycin A complex with d(CGCAAATTTGCG)₂ (ref. 17), while
the coordinates for the -(CH₂)₃N(CH₂CH₂CH₂NMe₂)₂ side chain
were generated in ChemNote of QUANTA. Energy minimization
using the steepest descent and adopted-basis Newton-Raphson
methods was performed before submitting these coordinates for
molecular dynamics (Verlet) calculation. Coordinates displayed
were obtained from the energy-minimized average coordinates of
the last picosecond out of a 35-psec dynamics calculation. The
structure shows that the triamine complexes the phosphate by
hydrogen bonding to two of the phosphate oxygens. (Bottom)
Plausible structure of metal-complexed 3a bound in the minor
groove of d(CGCAAATTTGCG)₂. Placement is such that a ligated
hydroxyl oxygen is in line with the departing O5'.

Though the molecular designing was carried out by using
the x-ray structure of the 1-d(CGCAAATTTGCG)₂ complex,
our synthetic compounds 3a, 3b, and 3c are, after Wade and
Dervan (27), elaborations of compound 2 (Fig. 2). The
rationale for our doing so follows. In the first place,
compound 1 is unstable in solution, and this instability can be
removed by replacing the formamido and 3-amidinopropyl
groups by the acetamido and 3-dimethylaminopropyl groups
to provide compound 2. Also, changing the positive amidine
group of 1 to the neutral tertiary amino group of 2 makes both
synthesis and purification much easier. In particular, one is
not bothered by the unwanted side reaction, during peptide
synthesis, of the coupling reagent DECP with the amidine
group to form a phosphatide. The x-ray structure of the
1-d(CGCAAATTTGCG)₂ complex shows that the terminal amidine
cationic group in distamycin lies deep in the minor groove.
The pKₐ value for the terminal tertiary amino group in 2, 3a,
3b, and 3c is about 9.5, so that it should be protonated at
neutral pH. Modeling establishes that the conversion of 1 to 2
is not accompanied by unwanted steric effects. The key
rationale for the synthetic scheme designed here is that
numerous metal-chelating groups such as polyamines can replace
the N-methyl substituent of the central pyrrole of 2 with
little synthetic difficulty.

In compounds 3a, 3b, and 3c, the
-(CH₂)ₙ-N[CH₂CH₂CH₂N(CH₃)₂]₂ functional group in which n = 3,
4, and 5, respectively (Fig. 1), replaces the N-methyl
substituent of the central pyrrole of 2. The pKₐ values of
CH₃N(H)[CH₂CH₂CH₂N(H)(CH₃)₂]₂³⁺ were determined to be 6.8,
9.6, and 12.2. Therefore, at neutrality the
-(CH₂)ₙ-N[CH₂CH₂CH₂N(CH₃)₂]₂ substituent exists to a
considerable extent in the diprotonated form. Electrostatic
interaction of the diprotonated side chain may occur with
adjacent phosphodiester linkages on both DNA strands or with
one or more phosphodiester linkages of a single strand (Fig. 3
Middle). In the instance of DNA complexes, with two molecules
of 3 sitting staggered side by side in antiparallel orientation
in the minor groove (as proposed in ref. 28), electrostatic
interactions of the two diprotonated polyamine side chains can
occur via interaction with adjacent phosphodiester linkages on
both strands. Such structures can be created by computer
graphics by using as base the two-dimensional NMR structure of
the 2:1 complex of distamycin A-d(CGCAAATTGGC)₂. It has
been reported that CH₃N(CH₂CH₂CH₂NH₂)₂ is a moderately
strong ligand for transition metal ions, such as Ni²⁺, Co²⁺,
and Cu²⁺ (29). Thus, the -N[CH₂CH₂CH₂N(CH₃)₂]₂ end
group should serve to hold a metal ion adjacent to a single
phosphodiester linkage. The metal ions ligated by the
-N[CH₂CH₂CH₂N(CH₃)₂]₂ moiety also may be bound to the PO⁻ of
the phosphate ester or may act as a carrier of a nucleophilic
hydroxyl group (Fig. 3 Middle).
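
The statement that the side chain is largely diprotonated at neutrality can be illustrated with a simple speciation calculation. The sketch below is a rough estimate under stated assumptions: the three reported pKa values (6.8, 9.6, 12.2) of the model triamine are treated as macroscopic constants that also apply to the tethered side chain, pH is taken as exactly 7.0, and ionic-strength and temperature effects are ignored. The function name `protonation_fractions` is hypothetical.

```python
def protonation_fractions(pkas, ph):
    """Fractional populations of each protonation state of a polyprotic acid.

    `pkas` are macroscopic pKa values in ascending order. Index 0 of the
    result is the fully protonated species; the last index is the fully
    deprotonated one.
    """
    weights = [1.0]
    for pka in pkas:
        # Each deprotonation step scales the relative population by Ka/[H+].
        weights.append(weights[-1] * 10 ** (ph - pka))
    total = sum(weights)
    return [w / total for w in weights]


fractions = protonation_fractions([6.8, 9.6, 12.2], ph=7.0)
labels = ["triprotonated (3+)", "diprotonated (2+)",
          "monoprotonated (1+)", "unprotonated (0)"]
for label, frac in zip(labels, fractions):
    print(f"{label}: {frac:.3f}")
```

With these inputs roughly 61% of the model triamine is diprotonated and about 39% is triprotonated at pH 7.0, which is consistent with the description of the diprotonated form as the predominant species at neutrality.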

Comparison of Binding Affinities. The binding affinities of
compounds 1, 2, 3a, 3b, 3c, 4a, and 4b to sonicated calf
thymus DNA and synthetic polymers poly(dA-dT), poly(dG-dC),
and poly(dI-dC) have been compared by using the EtdBr
displacement method (19). The ethidium moiety of the
DNA-EtdBr complex is quite fluorescent, and binding of agents
into the minor groove can be monitored because of the
release of EtdBr, which, when free, is less fluorescent by a
factor of 50. On a fluorescence scale of 0 to 1, the fractional
decrease in fluorescence is determined at saturation of DNA
by the complexing agent (Fig. 4). This value should be
indicative of the number of DNA binding sites to which the
agent can complex. The C₅₀, the concentration of the binding
agent that displaces half of the bound EtdBr that is displaced
at saturation by the agent, has been used as an estimate
of the binding constant for the agent. The index (C₅₀/%
decrease in fluorescence at saturation) values for binding of 1,
2, 3a, 3b, 3c, 4a, and 4b are provided in Table 1.

Fig. 4. Fluorescence intensity of EtdBr in the presence of calf
thymus DNA and 1, 2, 3a, and 4a, respectively. The ethidium
displacement curves for 3b and 3c are similar to that of 3a, and the
plot for 4b is similar to that for 4a. Conditions are described in the
legend of Table 1.
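
The definition of C50 used here (half of the ethidium that is displaceable at saturation, not half of all bound ethidium) can be made concrete with a short sketch. Everything numerical below is synthetic and for illustration only: the titration curve is generated from a simple hyperbolic formula with a hypothetical midpoint `K_HYP` and a hypothetical maximal fractional decrease `MAX_DROP`, and the function name `c50_from_titration` is an assumption, not a procedure taken from the text.

```python
def c50_from_titration(conc, fluor):
    """Concentration at which half of the displaceable EtdBr signal is lost.

    `fluor` is fluorescence on a 0-to-1 scale (1 = no agent added). The last
    point is taken as the saturation plateau.
    """
    f_sat = fluor[-1]
    target = 1.0 - (1.0 - f_sat) / 2.0  # halfway between start and plateau
    for i in range(1, len(conc)):
        f0, f1 = fluor[i - 1], fluor[i]
        if f0 >= target >= f1 and f0 != f1:
            # Linear interpolation inside the bracketing interval.
            frac = (f0 - target) / (f0 - f1)
            return conc[i - 1] + frac * (conc[i] - conc[i - 1])
    raise ValueError("curve never crosses the half-displacement level")


# Synthetic, illustrative titration curve (not measured data).
MAX_DROP = 0.87   # hypothetical fractional decrease at saturation
K_HYP = 0.30      # hypothetical midpoint, arbitrary concentration units
conc = [0.01 * i for i in range(10001)]
fluor = [1.0 - MAX_DROP * c / (c + K_HYP) for c in conc]

c50 = c50_from_titration(conc, fluor)
percent_decrease = 100.0 * (1.0 - fluor[-1])
index = c50 / percent_decrease  # index = C50 / % decrease at saturation
print(f"C50 = {c50:.3f}, % decrease = {percent_decrease:.1f}, index = {index:.4f}")
```

For this synthetic curve the interpolated C50 falls close to `K_HYP`, as expected for a hyperbolic displacement curve. Dividing by the percentage decrease at saturation gives an index in which smaller values correspond to tighter apparent binding.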

Distamycin binds DNA with strong preference for A+T
sequences (24). The index value obtained here for binding of
1 to poly(dA-dT) polymer is almost 220 times smaller than the
index value for binding of 1 to poly(dG-dC). The index value for
binding to calf thymus DNA, a natural DNA with 58% A+T
content, is about 70 times smaller than that of poly(dG-dC).
In short, to displace a similar amount of ethidium, a much
larger quantity of 1 is needed for sequences of G+C than for
A+T. The following conclusions may be drawn from the data
of Table 1.

(i) The binding of 1 to DNA is much more favorable than
is the binding of 2. The x-ray structure of the
1-d(CGCAAATTTGCG)₂ complex shows that the protonated amidine
substituent of 1 is within hydrogen-bonding distance of
adenosine-4 of the DNA substrate. The protonated amine 
substituent of 2, which replaces the amidine substituent in 1, 
is not capable of such hydrogen-bond formation. The more 
dispersed positive charge and hydrogen-bonding capability of 
the amidine structure allows greater interaction than does the 
tertiary amine. 

(ii) The index of binding of 2 to DNA serves as a standard
to compare binding indexes of 3a, 3b, and 3c. We anticipated
either a decrease in binding of 3a, 3b, and 3c as compared with
2 because of steric hindrance brought about by exchange of
N-(CH₂)ₙN[CH₂CH₂CH₂N(CH₃)₂]₂ for the N-methyl or an
increase in binding because of the electrostatic interaction of
the protonated tertiary amine of 3a, 3b, and 3c and the DNA
phosphate groups. The latter situation prevails, so that the
electrostatic interaction of DNA phosphate groups and the
protonated polyamine side chain of 3a, 3b, and 3c overcomes
the loss in binding capacity caused by loss of hydrogen
bonding when the amidine group of distamycin is exchanged
with a dimethylamine group to provide 2. Bifurcated hydrogen
bonds found between 1 and DNA in the x-ray and NMR
studies (17, 30) are probably missing for 3 should electrostatic
interactions with phosphates prevail.

(iii) Indices for 3a, 3b, and 3c are comparable to one
another and much the same as the indices for 1 in the
complexing of calf thymus DNA, poly(dA-dT), poly(dG-dC),
and poly(dI-dC). Poly(dI-dC) is the same as poly(dG-dC)
except there is no amino group on position 2 of each guanine
base protruding in the minor groove. The fact that there is
strong binding to poly(dI-dC) indicates that 3a, 3b, and 3c
share with 1 a minor-groove-binding mode. This conclusion
is supported by a study of the interaction of 3a with T4
coliphage DNA monitored by the change in extinction
coefficient and bathochromic shift of the longer wavelength
absorption band of 3a on complexation by DNA. The λmax of
3a has a similar red shift in binding to T4 DNA and to calf
thymus DNA. T4 coliphage DNA is glycosylated throughout
the major groove (31). Thus, complexing of 3a by T4 DNA
should only occur in the minor groove. In accordance with
the ethidium displacement results, 3a, 3b, and 3c have the
same base specificity and groove-binding characteristics to
DNA as does 1. The same conclusion is reached from spectral
analysis. The extent of the red shift follows the order
poly(dA-dT) (19 nm) ≈ calf thymus DNA (18 nm) > poly(dG-dC)
(6 nm). Thus, the bulky pyrrole N-substituents of 3a, 3b,
and 3c with associated positive charges do not alter the
specificity for DNA binding, which is due to the poly(pyrrole
amide) structure found in netropsin and distamycin. A bound
molecule of 3a, 3b, or 3c may, like distamycin, occupy a
site of 4 or 5 base pairs and, therefore, may displace the same
amount of bound ethidium as does 1. If this is so, and since
the indices of 1, 3a, 3b, and 3c are comparable, it could be
concluded that the equilibrium binding constants for 3a, 3b,
and 3c are comparable to that of 1 (20) with poly(dA-dT)
(10⁶ M⁻¹) and poly(dG-dC) (10⁴ M⁻¹).

(iv) Compounds 4a and 4b are derived from 2 by exchange
of an N-CH₃ for the bulky and nonfunctional
-(CH₂)ₙ-CH(OCH₂CH₃)₂ substituents (n = 2 and 3, respectively).
Thus, the binding of 4a and 4b should resemble the binding 
of 3a and 3b if the latter did not involve an electrostatic 
component. Binding of 4a and 4b to DNA is a bit weaker than 
is the DNA binding of 2. The index values for 4a and 4b are 
equal. At saturating concentrations of 4a and 4b, little ethid- 
ium is displaced, but some preference for A+T binding can 
still be found. The UV spectra of 4a or 4b show little or no 
change upon addition of DNAs. These data suggest that, due 
to steric constraints, the binding of 4a and 4b is restricted to 
a few A+T-rich sites and may also involve an outside 
electrostatic binding mode. 

Interaction of 3a and 3b with Superhelical Plasmid pBR322
DNA. This interaction results in the relaxation of the pBR322 
DNA in the absence of any reducing agents or metal ions 



Table 1. Index values for binding of distamycin A (compound 1) and its analogs to DNA

Compound   ctDNA              Poly(dA-dT)        Poly(dG-dC)        Poly(dI-dC)
1          0.28/87 = 0.003    0.12/96 = 0.001    6.5/30 = 0.217     0.18/96 = 0.002
2          5.4/65 = 0.083     4.5/85 = 0.053     5.5/20 = 0.275     6.0/90 = 0.067
3a         0.50/82 = 0.006    0.21/93 = 0.002    10/37 = 0.270      0.38/90 = 0.004
3b         0.35/90 = 0.004    0.21/93 = 0.002    8.0/40 = 0.200     0.38/90 = 0.004
3c         0.35/95 = 0.004    0.21/93 = 0.002    8.0/40 = 0.200     0.38/90 = 0.004
4a         5.5/40 = 0.138     7.5/70 = 0.107     5.5/20 = 0.275     8.4/48 = 0.175
4b         5.5/40 = 0.138     7.5/70 = 0.107     5.5/20 = 0.275     8.1/52 = 0.156

Index values = C₅₀/% decrease of fluorescence at saturation; for details, see text. Typically, a 3-ml
solution containing 40 mM NaCl, 25 mM Tris (pH 7.5), 2 µM calf thymus DNA (ctDNA), and
2.6 µM EtdBr was titrated with a concentrated solution of distamycin or an analog (100 µM to 1 mM),
and fluorescence was excited at 546 nm and measured at 591 nm at 37°C.
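
The arithmetic behind Table 1 and the selectivity ratios quoted in the text can be reproduced directly. The sketch below transcribes the table entries; the names `TABLE_1`, `index_value`, and `fold_preference` are hypothetical helpers introduced for illustration. One point worth noting: the "almost 220" and "about 70" ratios quoted for compound 1 follow from the published three-decimal index values, whereas the unrounded quotients give smaller ratios (about 173 and 67).

```python
POLYMERS = ("ctDNA", "poly(dA-dT)", "poly(dG-dC)", "poly(dI-dC)")

# compound -> one (C50, % decrease at saturation, published index) per polymer
TABLE_1 = {
    "1":  [(0.28, 87, 0.003), (0.12, 96, 0.001), (6.5, 30, 0.217), (0.18, 96, 0.002)],
    "2":  [(5.4, 65, 0.083), (4.5, 85, 0.053), (5.5, 20, 0.275), (6.0, 90, 0.067)],
    "3a": [(0.50, 82, 0.006), (0.21, 93, 0.002), (10, 37, 0.270), (0.38, 90, 0.004)],
    "3b": [(0.35, 90, 0.004), (0.21, 93, 0.002), (8.0, 40, 0.200), (0.38, 90, 0.004)],
    "3c": [(0.35, 95, 0.004), (0.21, 93, 0.002), (8.0, 40, 0.200), (0.38, 90, 0.004)],
    "4a": [(5.5, 40, 0.138), (7.5, 70, 0.107), (5.5, 20, 0.275), (8.4, 48, 0.175)],
    "4b": [(5.5, 40, 0.138), (7.5, 70, 0.107), (5.5, 20, 0.275), (8.1, 52, 0.156)],
}


def index_value(c50, percent_decrease):
    """Index = C50 divided by the % decrease in fluorescence at saturation."""
    return c50 / percent_decrease


def fold_preference(compound, weak, strong, use_published=True):
    """Ratio of index values (weak-binding polymer over strong-binding one)."""
    row = dict(zip(POLYMERS, TABLE_1[compound]))

    def idx(polymer):
        c50, pct, published = row[polymer]
        return published if use_published else index_value(c50, pct)

    return idx(weak) / idx(strong)


# Largest difference between a recomputed quotient and its published index.
max_gap = max(
    abs(index_value(c50, pct) - published)
    for entries in TABLE_1.values()
    for c50, pct, published in entries
)
print(f"largest rounding gap: {max_gap:.4f}")

for strong in ("poly(dA-dT)", "ctDNA"):
    rounded = fold_preference("1", "poly(dG-dC)", strong)
    exact = fold_preference("1", "poly(dG-dC)", strong, use_published=False)
    print(f"compound 1, poly(dG-dC) vs {strong}: "
          f"{rounded:.0f}x (published indices), {exact:.0f}x (unrounded)")
```

Every recomputed quotient agrees with its published index to within rounding at the third decimal place, so the transcribed table is internally consistent.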
Fig. 5. The topoisomer patterns of pBR322 DNA obtained after
relaxation with excess calf thymus topoisomerase I in the presence
of 0, 10, 20, 70, and 80 µM, respectively (lanes 1-5), of 3b. In each
sample, 5 µg of negatively supercoiled pBR322 was incubated at 37°C
for 17.5 hr with 20 units of topoisomerase I in a 5M solution
containing 1 mM EDTA, 50 mM KCl, 10 mM MgCl₂, 50 mM Tris (pH
7.5), and 3b. After relaxation, the DNA was extracted twice with
phenol equilibrated with 20 mM Tris (pH 8.0). This was followed by
a chloroform/octanol, 24:1 (vol/vol), extraction. The resulting
solution was passed through a NACS PREPAC cartridge (GIBCO/BRL)
and lyophilized. After lyophilization, aliquots of DNA were
loaded on a 1% agarose gel at 1.5 V/cm for 17 hr in E buffer (40 mM
Tris/20 mM sodium acetate/2 mM EDTA, pH adjusted to 7.7 with
glacial acetic acid). The gels were stained with diluted EtdBr solution
and photographed on a UV transilluminator with Polaroid type 665
films.

(data not shown). It was found, in the presence of 3a or 3b, 
that the amount of conversion of supercoiled DNA form I to
the relaxed form II did not continuously increase with time,
nor was the observed conversion proportional to the con- 
centration of 3a or 3b used. Therefore, rather than DNA 
strand breakage, mechanisms such as unwinding of super- 
coiled DNA must be responsible for the relaxation effect of 
3a or 3b on pBR322. 

To explore these possibilities, topoisomer families of 
pBR322 were prepared by relaxing negatively supercoiled 
DNA with calf thymus topoisomerase I (32, 33) in the 
presence of 3b. Supercoiled plasmid pBR322 was completely 
relaxed with an excess amount of topoisomerase I in the 
absence of 3b (Fig. 5, lane 1). With increasing concentrations 
of 3b, the gel shows (lanes 2 to 5) a family of bands reflecting 
the presence of DNA species with varying extents of
superhelicity. Increasing concentrations of 3b first relocate the
center of the Boltzmann distribution and shift it toward
DNA bands with higher superhelicities that will migrate 
faster than relaxed DNA on the agarose gel. Further increase 
in 3b concentration decreases the bands of higher superhe- 
licities back to a relaxed form. These results provide direct 
evidence for the unwinding ability of 3b. At comparable 
concentrations, the change in superhelicity followed the 
order netropsin (34) ≈ 3b >> 1.

In summary, we have designed molecules (3a, 3b, and 3c)
in which the central pyrrole N-methyl group of compound 2,
which displays weak A+T minor-groove binding, is replaced
by the positively charged metal-chelating polyamine group
-(CH₂)ₙN[CH₂CH₂CH₂N(CH₃)₂]₂. We have shown that 3a,
3b, and 3c are strong minor-groove binders, and this feature
must be associated with the electrostatic interaction of the
positively charged polyamine side chain with phosphodiester
linkages. This binding mode results in the unwinding of
supercoiled DNA. Compounds such as 3a, 3b, and 3c may be
useful in exploring DNA conformations. These results
establish that the synthesis of putative catalysts for DNA
hydrolysis, based upon the general concepts embodied in
compounds 3a, 3b, and 3c, is worthy of pursuit.